Abstract


A typical computer user today manages passwords for many different online
accounts. Users struggle with this task, often forgetting their passwords or
adopting insecure practices, such as using the same passwords for multiple
accounts and selecting weak passwords. While there are many books, articles,
papers and even comics about selecting strong individual passwords, there is
very little work on password management schemes: systematic strategies to help
users create and remember multiple passwords. Before we can design good
password management schemes it is necessary to address a fundamental question:
How can we quantify the usability or security of a password management scheme?
One way to quantify the usability of a password management scheme would be to
conduct user studies evaluating each user's success at remembering multiple
passwords over an extended period of time. However, these user studies would
necessarily be slow and expensive and would need to be repeated for each new
password management scheme. Our thesis is that user models and security models
can guide the development of password management schemes with analyzable
usability and security properties. We present several results in support of
this thesis. First, we introduce Naturally Rehearsing Password schemes.
Notably, our user model, which is based on research on human memory about
spaced rehearsal, allows us to analyze the usability of this family of schemes
while experimentally validating only the common user model underlying all of
them. Second, we introduce Human Computable Password schemes, which leverage
human capabilities for simple arithmetic operations. We provide constructions
that make modest demands on users and we prove that these constructions provide
strong security: an adversary who has seen about 100 10-digit passwords of a
user cannot compute any other passwords except with very low probability. Our
password management schemes are precisely specified and publishable: the
security proofs hold even if the adversary knows the scheme and has extensive
background knowledge about the user (hobbies, birthdate, etc.). They do not
require any significant server-side changes. In further support of our thesis,
we show that user models and security models can also be used to develop
server-side defenses against online and offline attacks.

Chapter 1


Overview 

1.1  Introduction 


A typical computer user has many different online accounts which require some
form of authentication. While passwords are still the dominant form of
authentication, users struggle to remember their passwords. As a result users
often adopt insecure password practices (e.g., reusing the same password,
selecting common passwords) [39, 52, 75, 102] or end up having to frequently
reset their passwords. There have been numerous recent examples of major
password breaches [3, 6, 8, 9, 10, 11, 12, 13, 14, 28, 52, 141].

An adversary may crack a weak password in an online attack, in which he poses
as a legitimate user and tries as many password guesses as the site permits
before he is locked out. If the cryptographic hash of a password is leaked or
stolen, an adversary will be able to mount a more dangerous attack known as an
offline attack, in which he can continue guessing the user's password
indefinitely. Unfortunately, these attacks are commonplace (e.g., breaches at
Zappos, LinkedIn, Sony, Gawker and Adobe have affected millions of users
[6, 8, 9, 11, 13, 14, 28]). Users are often advised to pick long passwords that
include numbers, special characters and capital letters to protect themselves
in the event of an offline attack [132]. Even the strongest passwords can be
compromised via a plaintext password leak attack, which could occur because the
user fell prey to a phishing attack, because the user signed into his account
on an infected computer, or due to software misconfigurations (e.g.,
[5, 10, 12, 141]). Users are typically advised against reusing the same
password to protect themselves in the event of a plaintext password leak.

Password Management Schemes. Informally, a password management scheme is a
strategy that a user could follow to create and remember each of his passwords.
One of the central goals of this thesis is to develop password management
schemes which can be implemented on "human hardware." A good password
management scheme should be usable and secure. Informally, a password
management scheme is usable if a human can create and recall passwords without
too much effort. A secure password management scheme must provide concrete
security guarantees even against an adversary who has already learned one or
more of the user's passwords. Before we can design good password management
schemes it is necessary to address two fundamental questions: How can we
quantify the usability of a password management scheme? How can we quantify the
security of a password management scheme? While it is straightforward to
introduce a quantitative security model based on the attack scenarios described
earlier, it is more challenging to quantify usability because our understanding
of human memory is incomplete. One way to evaluate the usability of a candidate
password management scheme is to conduct a large user study. However, this
would make the design process slow and expensive as the user study would have
to evaluate each user's success at remembering multiple passwords over an
extended period of time. Our goal is to develop a quantitative usability model
so that the design process for password management schemes could be separated
from the validation of the usability model. In Chapter 2 we introduce a
mathematical framework for quantifying the usability of a password management
scheme, as well as a mathematical framework for quantifying the security of a
password management scheme. Our usability model builds on cognitive psychology
literature about spaced repetition and human memory. Using these models we
develop a novel password management scheme, Shared Cues, which balances
security and usability. In Chapter 3 we develop several password management
schemes with even stronger security properties than Shared Cues by leveraging
the user's capacity to perform simple computations (e.g., addition modulo 10)
in his head, and we apply our usability model from Chapter 2 to help quantify
the usability of our human computable password management schemes.


Defenses for a System Administrator. In this thesis we also suggest several
defenses that a system administrator could adopt to mitigate the threat of
online and offline attacks. One way to defend against online attacks is to
adopt a password composition policy which specifies the passwords that a user
may or may not select (e.g., one common policy says that each password must
contain at least one capital letter and at least one number). The goal of these
policies is to ensure that an online adversary's first few guesses are likely
wrong by disallowing overly popular passwords¹. One natural question to ask is
whether or not we can efficiently compute the best password composition policy
given sufficient data about the password preferences of our users. In Chapter 5
we initiate the algorithmic study of password composition policies and present
an algorithm to find the optimal policy with positive rules (e.g., one
potential positive rules policy specifies that a password is allowed if it
satisfies one of the following conditions: 1) it is longer than 15 characters,
2) it is longer than 12 characters and contains upper and lower case letters,
or 3) it is longer than 9 characters and contains numbers as well as upper and
lower case letters). In Chapter 6 we present a defense against offline attacks
called GOTCHAs. The basic idea behind GOTCHAs is to exploit hard artificial
intelligence problems to ensure that human feedback is necessary to validate
each different password guess. This dramatically increases the adversary's cost
during an offline attack.
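
As an illustration, the example positive rules policy quoted above can be written as a small predicate. This is only a sketch of that one example policy: the function name `is_allowed` is hypothetical, and it says nothing about how Chapter 5 searches for an optimal policy.

```python
def is_allowed(password: str) -> bool:
    """Check the example positive rules policy: a password is allowed
    if it satisfies at least one of the three rules."""
    n = len(password)
    has_upper = any(c.isupper() for c in password)
    has_lower = any(c.islower() for c in password)
    has_digit = any(c.isdigit() for c in password)

    rules = [
        n > 15,                                           # rule 1
        n > 12 and has_upper and has_lower,               # rule 2
        n > 9 and has_digit and has_upper and has_lower,  # rule 3
    ]
    return any(rules)
```

Because the rules are positive, a password only needs to satisfy one of them, so adding a rule can only enlarge the set of allowed passwords.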


Organization.  We  first  state  our  thesis  in  Section  1.2.  In  the  remainder  of  this 
chapter  we  briefly  summarize  each  of  the  remaining  chapters  in  this  thesis.  In  these 
summaries  we  emphasize  how  each  of  our  proposed  defenses  would  be  used  in 
practice,  while  postponing  a  discussion  of  the  technical  details  to  later  chapters. 
Chapter  2  —  based  on  the  work  of  Blocki  et  al.  [33]  —  takes  the  perspective  of 
the  user  who  is  given  the  complex  task  of  creating  and  remembering  passwords 
for  multiple  accounts.  We  overview  Chapter  2  in  Section  1.3.  In  Chapter  4  we 
describe  an  ongoing  user  study  that  we  are  conducting  to  quantify  the  effects  of 
rehearsal  and  the  use  of  mnemonic  techniques  on  long  term  memory  retention. 
We  give  a  brief  overview  of  this  user  study  in  Section  1.5.  Chapter  3  —  based 
on  the  work  of  Blocki  et  al.  [36]  —  continues  the  line  of  research  from  Chapter 
2. We develop even more secure password management schemes by considering
schemes  in  which  the  user  must  perform  a  few  simple  computations  in  his  head 
to  compute  each  of  his  passwords.  We  overview  Chapter  3  in  Section  1.4.  In 
contrast  to  Chapters  2  and  3,  Chapters  5  and  6  take  the  perspective  of  a  system 
administrator  at  a  large  company  who  is  trying  to  protect  users  against  online  and 
offline  attacks.  Chapter  5  —  based  on  the  work  of  Blocki  et  al.  [34]  —  deals  with 
online  attacks.  We  overview  Chapter  5  in  Section  1.6.  Chapter  6  —  based  on  the 
work  of  Blocki  et  al.  [31]  —  deals  with  offline  attacks.  We  overview  Chapter  6  in 
Section  1.7. 

¹ After a few incorrect guesses the server can temporarily lock the user's
account to stop the online adversary.

1.2 Statement of Thesis


We  argue  in  support  of  the  following  thesis: 

User models and security models can guide the development of human
authentication schemes with analyzable usability and security properties.

We  present  two  sets  of  results  in  support  of  this  thesis.  We  present  our  first  set  of 
results  in  Chapters  2  and  3.  While  there  are  many  articles,  books,  papers  and  even 
comics  about  selecting  strong  individual  passwords  [4, 46, 49, 82, 109, 132, 146, 162], 
there  is  very  little  work  on  password  management  schemes — systematic  strategies 
to  help  users  create  and  remember  multiple  passwords — that  are  both  usable 
and  secure.  One  of  the  primary  goals  of  this  thesis  is  to  develop  theoretical 
models  to  help  quantify  the  security  and  usability  of  a  password  management 
scheme  and  to  use  these  models  to  develop  better  password  management  schemes. 
A  good  password  management  scheme  should  be  provably  usable  and  should 
provably  result  in  secure  passwords.  Furthermore,  the  password  management 
scheme  needs  to  be  publishable,  meaning  that  the  security  proof  should  hold 
even  if  the  adversary  knows  the  password  management  scheme  that  our  user 
is  following.  We  present  models  to  quantify  the  usability  and  the  security  of  a 
password  management  scheme,  and  we  use  these  models  to  develop  password 
management  schemes  that  provably  balance  security  and  usability.  We  present 
our  second  set  of  results  in  support  of  our  thesis  in  Chapters  5  and  6,  where  we 
develop  and  analyze  defenses  for  offline  and  online  attacks  against  passwords. 
The  defenses  we  propose  follow  naturally  from  our  user  models  and  our  security 
models. 


1.2.1  User  Models 

A  user  model  may  either  specify  capabilities  of  the  user  or  describe  how  the  user 
will  behave  in  different  scenarios;  possibly  both.  In  Chapter  2  our  user  model 
consists  of  a  memory  assumption  and  a  visitation  schedule  for  each  of  the  user's 
accounts.  The  memory  assumption  states  that  a  user  is  capable  of  remembering 
a  story  if  he  follows  a  particular  rehearsal  schedule,  and  our  visitation  schedule 
specifies  how  often  a  user  will  naturally  rehearse  each  of  his  passwords.  This  user 
model  allows  us  to  quantify  the  usability  of  a  password  management  scheme  by 
predicting how much extra effort a user would need to expend to remember all of
his passwords. In Chapter 3 we expand the user model of Chapter 2 by assuming
that the user can also perform simple computations (e.g., addition modulo 10)
in his head, and we show how to develop even more secure human computable
password management schemes. In Chapter 5 our user model predicts how users
will update their passwords in response to restrictions. While our user model
in this chapter is simple and intuitive, it allows us to develop a novel
algorithm to find the optimal password composition policy to defend against
online attacks. In Chapter 6 our user model specifies a task that a human can
do, but a computer cannot (e.g., imagine objects in a randomly generated
inkblot image and recognize those same objects later). We certainly do not
claim to provide a comprehensive list of tasks that a human can do, but a
computer cannot. However, we are able to develop a novel defense against
offline attacks called GOTCHAs based on a simple security assumption (e.g.,
given two randomly generated inkblot images and a human-generated label for one
of the inkblot images the computer cannot accurately predict which inkblot
image was labeled).

Each of our user models consists of an assumption about the user's capabilities
(e.g., a user is able to remember a story if he follows a given rehearsal
schedule, a user is able to imagine objects in an inkblot image and recognize
those same objects later, a user is able to add two digits modulo 10 in his
head) and/or a description of the user's behavior (e.g., how will a user change
his password in response to a composition policy, how often will a user log in
to each of his accounts). Borrowing terminology from logic, our goal is to
develop user models that are sound (e.g., users truly possess the capabilities
specified by our model), but not necessarily complete (e.g., users may have
many other capabilities not considered by our models).


Related  Work 

A distinctive goal of our work is to develop a quantitative usability model so
that the design process for password management schemes can be separated from
the validation of the usability model. In contrast, a line of prior work on
usability has focused on empirical studies of user behavior including their
password management habits [52, 75, 102], the effects of password composition
rules (e.g., requiring numbers and special symbols) on individual passwords
[34, 101], the memorability of individual system assigned passwords [140],
graphical passwords [27, 48], and passwords based on implicit learning [38].
These user studies have been limited in duration and scope (e.g., study
retention of a single password over a short period of time) and can only test a
very specific hypothesis.


Example. As an example, consider the suggestion that Randall Munroe gave in his
popular webcomic XKCD (see Figure 1.1). He suggested that users create their
passwords by picking four random words from the dictionary and creating a
story². A user study conducted by Shay et al. [140] indicated that users had
more difficulty when asked to remember three to four random words than when
they were asked to remember 5 to 6 random characters. However, the user study
was only able to test a very specific hypothesis: that users have less
difficulty remembering 5 to 6 random characters than remembering 3 to 4 random
words from a specific small dictionary when the users are not given
instructions about how to memorize their passwords. This leaves open a host of
other questions. What if the users were required to follow specific
instructions about how to memorize their words (e.g., by making up a story)?
Would it still be harder to remember 4 random words? What if we used a larger
dictionary with fewer words per password? Is it still easier to remember 6
random characters than to remember 2 random words from a larger dictionary?
Most importantly, what if the user is memorizing multiple passwords?
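
The trade-off behind these questions can be made concrete by comparing the size of the space each kind of password is drawn from. The sketch below computes the bits of entropy of a uniformly random selection. The dictionary and alphabet sizes (2,048 words, 95 printable characters, roughly 4 million words) are assumptions chosen for illustration, not figures from the study, and the calculation measures only resistance to guessing, not memorability.

```python
import math

def bits_of_entropy(choices: int, length: int) -> float:
    """Entropy in bits of `length` independent uniform picks
    from a set of `choices` equally likely items."""
    return length * math.log2(choices)

# Hypothetical sizes, chosen only to illustrate the comparison.
four_small_words = bits_of_entropy(2048, 4)      # small dictionary
six_characters = bits_of_entropy(95, 6)          # printable ASCII
two_large_words = bits_of_entropy(4_194_304, 2)  # much larger dictionary

print(f"4 words from 2,048:      {four_small_words:.1f} bits")
print(f"6 random characters:     {six_characters:.1f} bits")
print(f"2 words from ~4 million: {two_large_words:.1f} bits")
```

Under these assumptions, two words from the larger dictionary carry the same entropy as four words from the smaller one, which is why the choice of dictionary matters when comparing memorability results.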


Usability Models. One way to evaluate the usability of a candidate password
management scheme would be to conduct a large user study. However, this would
make the design process slow and expensive as the user study would have to
evaluate each user's success at remembering multiple passwords over an extended
period of time. In Chapter 2 we introduce a mathematical framework for
quantifying the usability of a password management scheme, as well as a
mathematical framework for quantifying the security of a password management
scheme. Our usability model allows us to separate the design process from the
validation of the usability model. Our usability model builds on cognitive
psychology literature about spaced repetition and human memory [160]. In
Chapter 4 we describe our ongoing user study to test the usability model
itself.


² Munroe does not deal with the more complicated problem of creating and
remembering multiple passwords.

1.2.2 Quantitative Security Model


We present a game-based security model for a password management scheme in the
style of the exact security definitions of Bellare and Rogaway [26]. The game
is played between a user (U) and a resource-bounded adversary (A) whose goal is
to guess one of the user's passwords. Our game models three commonly occurring
breaches (online attack, offline attack, plaintext password leak attack). Our
security model is fundamentally different from metrics like guessing entropy
(e.g., How many guesses does an adversary need to guess all of the passwords in
a dataset [107]?) and partial guessing entropy (e.g., How many guesses does the
adversary need to crack an α-fraction of the passwords in a dataset [39, 121]?
How many passwords can the adversary break with β guesses per account [45]?),
which take the perspective of a system administrator who is trying to protect
many users with password-protected accounts on his server. For example, a
system administrator who wants to evaluate the security effects of a new
password composition policy may be interested in knowing what fraction of user
accounts are vulnerable to offline attacks. By contrast, our security model
takes the perspective of the user who has many different password-protected
accounts. Our user wants to evaluate the security of various password
management schemes that he could choose to adopt. He is not worried about how
many Amazon passwords could be cracked in three guesses. Instead, he will be
worried about whether or not his personal accounts are vulnerable.
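
To make the administrator's perspective concrete, the following sketch computes two dataset-level quantities of the kind mentioned above: the fraction of accounts cracked with a fixed number of guesses per account, and the number of guesses needed to crack a given fraction of accounts. It is a simplified illustration, not the exact definitions from the cited works. The function names are hypothetical, the dataset is assumed to be non-empty, and the adversary is assumed to guess passwords in order of their popularity in that dataset.

```python
from collections import Counter

def success_rate(passwords, guesses_per_account):
    """Fraction of accounts cracked when the adversary tries the
    `guesses_per_account` most popular passwords on every account."""
    top = Counter(passwords).most_common(guesses_per_account)
    return sum(count for _, count in top) / len(passwords)

def guesses_needed(passwords, fraction):
    """Number of most-popular-first guesses per account needed to
    crack at least `fraction` of the accounts."""
    counts = Counter(passwords)
    target = fraction * len(passwords)
    cracked = 0
    for guesses, (_, count) in enumerate(counts.most_common(), start=1):
        cracked += count
        if cracked >= target:
            return guesses
    return len(counts)

# Toy dataset (made up for illustration).
dataset = ["123456"] * 5 + ["password"] * 3 + ["qwerty", "x7#Lq"]
print(success_rate(dataset, 2))       # fraction cracked with 2 guesses
print(guesses_needed(dataset, 0.85))  # guesses to crack 85% of accounts
```

Both quantities describe a population of accounts. Neither tells an individual user whether his own passwords are at risk, which is the question our security model addresses.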


1.2.3 Developing Human Authentication Schemes with Analyzable Security and Usability Properties

No user model will perfectly capture all of the intricacies of human memory and
behavior. As George E. P. Box famously observed [44], "essentially, all models
are wrong, but some are useful." We argue that our quantitative security and
usability models are useful because they allow us to develop usable and secure
human authentication schemes. In Section 2.6 we develop a novel password
management scheme, Shared Cues, which balances security and usability, and in
Chapter 3 we develop several password management schemes with even stronger
security properties than Shared Cues by leveraging the user's capacity to
perform simple computations (e.g., addition modulo 10) in his head. In both of
these cases insights from our usability model in Chapter 2 helped us to
optimize for usability while we were developing our password management
schemes. In Chapter 5 we also propose a simple model of how users change their
passwords in response to a password composition policy, and we use this model
to develop an efficient sampling algorithm to find the most secure password
composition policy.


Analyzable Usability and Security. Because our goal is to develop quantitative
security and usability models we focus on password management schemes that are
precisely specified. It is not always possible to quantify the security or
usability of a password management scheme that is not precisely defined. As an
example, consider the advice provided by Computing Facilities at the School of
Computer Science at Carnegie Mellon University³. They recommend that users take
the following steps to create their passwords:

1.  Make  up  a  sentence  you  can  easily  remember.  Some  examples: 

I  have  two  kids:  Jack  and  Jill. 

I  like  to  eat  Dave  &  Andy's  ice  cream. 

No,  the  capital  of  Wisconsin  isn't  Cheeseopolis! 

2.  Now  take  the  first  letter  of  every  word  in  the  sentence,  and  include 
the  punctuation.  You  can  throw  in  extra  punctuation,  or  turn  numbers 
into  digits  for  variety.  The  above  sentences  would  become: 

Ih2k:JaJ. 

IlteD&A'ic. 

N,tcoWi'C! 

...Please don't use one of the sentences above to generate your password.
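
The second step of this advice is mechanical enough to write down as code, which makes it easy to see what the instructions leave open. The sketch below is one possible reading of that step, not an official procedure: it keeps the first letter of each word together with any punctuation, and turns number words into digits. The function name `abbreviate` and the `NUMBER_WORDS` table are hypothetical.

```python
# Hypothetical table for the "turn numbers into digits" suggestion.
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

def abbreviate(sentence: str) -> str:
    """Keep the first letter of every word plus all punctuation."""
    pieces = []
    for token in sentence.split():
        word = "".join(ch for ch in token if ch.isalnum())
        # A number word becomes a digit; otherwise keep the first letter.
        first = NUMBER_WORDS.get(word.lower(), word[:1])
        emitted_first = False
        for ch in token:
            if ch.isalnum():
                if not emitted_first:
                    pieces.append(first)
                    emitted_first = True
            else:
                pieces.append(ch)  # punctuation is kept in place
    return "".join(pieces)

print(abbreviate("I have two kids: Jack and Jill."))
```

The code reproduces the three abbreviations listed in the advice, but it takes the sentence as given: nothing in the procedure says where the sentence comes from.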

These instructions do not clearly specify how the user should select the
sentences for each of his passwords. Does our user select similar (or
identical) sentences for several (all) of his passwords? If he selects similar
sentences for each of his passwords then it may be easier to remember all of
his passwords, but now a plaintext password breach at one of the user's
accounts will leave our user's other passwords vulnerable. Does he pick his
sentence(s) from a favorite poem, book or movie instead of stringing together
truly random words? If he does then the sentence may be more memorable, but the
resulting password(s) will be easier for an adversary to break, especially if
the adversary has background knowledge about the user. By contrast, the
password management schemes that we present in Chapters 2 and 3 are defined
precisely (e.g., if we ask users to memorize a story we specify the random
distribution from which the story should be drawn, and we even give the user
instructions about how to memorize the story).

³ Source: http://www.cs.cmu.edu/~help/security/choosing_passwords.html
(Retrieved May 5, 2014).

1.3 Usable and Secure Password Management


A typical computer user may have many password-protected accounts (e.g.,
Amazon, Google, LinkedIn) that require authentication, and the user may wish to
authenticate using a diverse set of computing devices (e.g., desktop, personal
laptop, public computer, friend's computer, smartphone). The user needs to
account for all three types of attacks: online attacks, offline attacks and
plaintext password leak attacks. If there is a password breach at LinkedIn [13]
will the user's password resist offline cracking attempts? If the user borrows
his friend's malware-infected computer to log in to Google will the adversary
also be able to recover the user's password for Amazon? Our user also needs to
worry about remembering his password(s) because the password-reset process is
often costly [158].

In  Chapter  2  we  introduce  quantitative  usability  and  security  models  to  guide 
the  design  of  password  management  schemes  —  systematic  strategies  to  help  users 
create  and  remember  multiple  passwords.  In  the  same  way  that  security  proofs 
in  cryptography  are  based  on  complexity-theoretic  assumptions  (e.g.,  hardness  of 
factoring  and  discrete  logarithm),  we  quantify  usability  by  introducing  usability 
assumptions.  In  particular,  password  management  relies  on  assumptions  about 
human  memory,  e.g.,  that  a  user  who  follows  a  particular  rehearsal  schedule 
will  successfully  maintain  the  corresponding  memory.  These  assumptions  are 
informed  by  research  in  cognitive  science  and  can  be  tested  empirically.  To  quantify 
the  usability  of  the  password  scheme  we  predict  how  much  'extra  effort'  a  user 
would  have  to  expend  to  remember  all  of  his  passwords.  We  say  that  a  user 
rehearses  a  secret  naturally  whenever  he  recalls  that  secret  to  log  into  one  of  his 
accounts.  If  a  user  does  not  get  sufficient  natural  rehearsal  for  a  secret  then  he 
will  need  to  be  reminded  to  rehearse  that  secret.  We  call  this  an  extra  rehearsal. 
Given rehearsal requirements and a user's visitation schedule for each account,
we can predict how many times our user will need to be reminded to perform
extra rehearsals to ensure that he remembers all of his passwords. Our
usability model led us to a key observation: password reuse benefits users not
only by reducing the number of passwords that the user has to memorize but,
more importantly, by increasing the natural rehearsal rate for each password.
We also present a
security  model  which  accounts  for  the  complexity  of  password  management  with 
multiple  accounts  and  associated  threats,  including  online,  offline,  and  plaintext 
password  leak  attacks.  Observing  that  current  password  management  schemes 
are  either  insecure  or  unusable,  we  present  Shared  Cues  —  a  novel  password 
management  scheme  in  which  the  underlying  secrets  that  the  user  memorizes  are 
strategically shared across accounts to ensure that most rehearsal requirements
are  satisfied  naturally  while  simultaneously  providing  strong  security  guarantees. 
Our  construction  uses  the  Chinese  Remainder  Theorem  to  strategically  share  the 
secrets  in  a  way  that  achieves  these  competing  goals. 
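
The notion of extra rehearsals can be illustrated with a short sketch. It assumes, purely for illustration, that the rehearsal requirement is an expanding schedule in which a secret must be rehearsed at least once in each interval [1, 2), [2, 4), [4, 8), ... days. The precise rehearsal assumptions and visitation model are given in Chapter 2, and the function names and the doubling constant here are hypothetical.

```python
from bisect import bisect_left

def rehearsal_intervals(horizon_days, base=2.0):
    """Expanding intervals [base^i, base^(i+1)) up to the horizon.
    The doubling schedule is an assumption made for this sketch."""
    intervals = []
    start = 1.0
    while start < horizon_days:
        end = start * base
        intervals.append((start, min(end, horizon_days)))
        start = end
    return intervals

def extra_rehearsals(visit_days, horizon_days, base=2.0):
    """Count intervals in which no natural rehearsal (visit) occurs;
    each such interval needs one extra rehearsal."""
    visits = sorted(visit_days)
    extra = 0
    for start, end in rehearsal_intervals(horizon_days, base):
        i = bisect_left(visits, start)  # first visit at or after start
        if i == len(visits) or visits[i] >= end:
            extra += 1
    return extra

# Hypothetical visitation schedules (days on which the user logs in).
account_a = [3, 40]
account_b = [1, 5, 10, 20]
print(extra_rehearsals(account_a, 64))              # secret used only for A
print(extra_rehearsals(account_b, 64))              # secret used only for B
print(extra_rehearsals(account_a + account_b, 64))  # secret shared by both
```

With these made-up schedules, a secret used for only one account misses several rehearsal intervals, while a secret shared by both accounts is rehearsed naturally in every interval. This is the effect described by the key observation above.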

1.3.1  Overview 

We  are  developing  an  application  which  implements  the  Shared  Cues  password 
management  scheme. 


Memorizing Person-Action-Object Stories. To begin using our application the
user first memorizes several randomly generated Person-Action-Object (PAO)
stories. Figures 1.2 and 1.3 illustrate this process. To memorize each PAO
story we show the user four images: a person, an action, an object and a scene.
We instruct the user to imagine the PAO story taking place inside the scene.

Spend  10  seconds  visualizing  each  story  in  your  head,  and  try  to 
make  it  as  vivid  as  possible  by  thinking  of  details.  For  example,  suppose 
that  you  see  the  story  President  Bush  is  flipping  a  leaf.  When  you  are 
picturing  this  story  in  your  head  you  should  try  to  answer  questions 
like  the  following:  Is  the  leaf  big  or  small?  What  color  is  the  leaf?  Is 
President  Bush  laughing  or  frowning? 

After the user has memorized a story the application stores the images of the
person and the scene, but discards the images of the action and object. The
images of the person and the scene will be used as a public cue to help the
user remember the secret action and object. We emphasize that the actions and
objects in each of these stories are randomly chosen by the computer, not by
the user. If the user selected the action and the object then he might pick
words that are correlated with the person or the scene (e.g., in Figure 1.2 the
user might pick the object 'apple' or 'penguin' because these words are
correlated with Steve Jobs and the glacier respectively). By having the
computer select the story we can ensure that the action and object are not
correlated with the scene or the person that the user is shown.
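
A minimal sketch of this step is shown below, assuming the application draws the action and object uniformly at random from fixed word lists. The word lists and function names are hypothetical placeholders. Chapter 2 specifies the actual distribution.

```python
import secrets

# Hypothetical word lists; a real deployment would use much larger ones.
ACTIONS = ["flipping", "painting", "juggling", "tickling"]
OBJECTS = ["leaf", "piano", "igloo", "hammer"]

def new_story(person: str, scene: str) -> dict:
    """Create a PAO story whose secret part is chosen by the computer,
    independently of the person and scene shown to the user."""
    return {
        "person": person,
        "scene": scene,
        "action": secrets.choice(ACTIONS),  # cryptographically strong RNG
        "object": secrets.choice(OBJECTS),
    }

def public_cue(story: dict) -> dict:
    """The part the application keeps: person and scene only.
    The action and object are discarded after memorization."""
    return {"person": story["person"], "scene": story["scene"]}

story = new_story("Steve Jobs", "glacier")
print(public_cue(story))
```

Using `secrets.choice` rather than `random.choice` reflects the requirement that the secret words be unpredictable to an adversary who knows the scheme and the public cue.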


Creating an account. To help the user create a password for an account the
application first chooses four of the PAO stories (see Chapter 2 for more
details).

Figure 1.2: Example 1. Memorizing a Person-Action-Object Story

Figure 1.3: Example 2. Memorizing a Person-Action-Object Story

Figure 1.4: Login Example 1


The  user  is  shown  the  images  of  the  person  and  scene  in  each  of  the  four  stories. 
To  form  his  password  the  user  remembers  the  secret  action  and  object  associated 
with  each  story  and  concatenates  all  of  these  words  together.  Figures  1.4  and  1.5 
illustrate  this  process  for  two  different  accounts.  Observe  that  these  two  accounts 
have  one  common  PAO  story  (the  story  involving  Bart  Simpson  at  the  Niagara 
Falls).  By  sharing  stories  across  accounts  we  can  cut  down  on  the  number  of  PAO 
stories  that  the  user  has  to  memorize.  In  Chapter  2  we  show  that  this  can  be  done 
in  a  way  that  preserves  strong  security  properties.  The  application  keeps  track 
of  which  stories  are  used  for  each  account,  but  does  not  store  any  of  the  user's 
passwords. 


Logging into an account. Logging into an account is similar to creating an account.
When the user wants to log in to an account the application displays the
images of the person and scene in each of the four stories associated with that
account (e.g., see Figure 1.4). We stress that these images will be the same images
that the user saw when he created the account. To recreate his password the user
remembers the secret action and object associated with each story and concatenates
all of these words together.


Figure 1.5: Login Example 2


Helping  the  user  remember  all  his  stories.  Our  application  keeps  track  of  when 
the  user  rehearses  each  of  his  PAO  stories.  A  user  naturally  rehearses  a  PAO  story 
whenever  he  uses  that  story  to  login  to  one  of  his  accounts.  If  a  user  has  not 
rehearsed  a  PAO  story  in  a  long  time  then  our  application  will  remind  that  user  to 
rehearse  the  story.  We  call  this  an  extra  rehearsal.  Observe  that  the  two  accounts 
illustrated  in  Figures  1.4  and  1.5  have  one  common  PAO  story  (the  story  involving 
Bart  Simpson  at  the  Niagara  Falls).  By  reusing  stories  for  different  accounts  we 
can  minimize  the  total  number  of  stories  that  the  user  needs  to  remember  and 
maximize  the  frequency  of  natural  rehearsal  for  each  story.  Whenever  possible, 
our  application  ensures  that  each  story  is  used  as  part  of  the  password  for  a 
frequently  visited  account. 


Security  Guarantees.  As  an  example  suppose  that  the  user  is  willing  to  memorize 
9  PAO  stories.  Our  application  can  help  the  user  generate  126  different  passwords, 
while  providing  our  user  with  the  following  modest  security  guarantee:  any 
adversary  who  has  seen  one  of  your  passwords  will  not  be  able  to  break  any  of 
your  other  passwords  in  an  online  attack  except  with  small  probability.  If  the 
user  is  willing  to  memorize  43  PAO  stories  then  our  application  can  help  the  user 
generate  110  different  passwords,  while  providing  the  following  much  stronger 
security  guarantee:  any  adversary  who  has  seen  one  of  your  passwords  will  not 
be  able  to  break  any  of  your  other  passwords  in  an  offline  attack. 
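
The figure of 126 passwords from 9 stories matches the number of ways to choose four stories out of nine, $\binom{9}{4} = 126$. The sketch below is only an illustration under one assumption that is not spelled out in this chapter: every account is assigned a distinct set of four stories. The names `stories` and `accounts` are hypothetical. It enumerates those sets and checks how many stories two different accounts can have in common.

```python
from itertools import combinations
from math import comb

# Hypothetical labels for the 9 memorized PAO stories.
stories = range(9)

# Assumption: each account uses a distinct set of 4 of the 9 stories.
accounts = list(combinations(stories, 4))
num_accounts = len(accounts)

# Largest number of stories shared by any two different accounts.
max_shared = max(len(set(a) & set(b)) for a, b in combinations(accounts, 2))

print(num_accounts, comb(9, 4), max_shared)
```

Under this assumption two accounts never share all four stories, so an adversary who has seen one password is still missing at least one secret action and object for every other password.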


1.4  Human  Computable  Passwords 

The Shared Cues password management scheme from Chapter 2 relies only
on the human capacity to memorize and retrieve information, but it is secure
against at most a constant number of plaintext password leak attacks. Could we
improve security (or usability) by having the user perform simple computations to
recover his passwords? In Chapter 3 we propose a human computable password
scheme and provide strong evidence that the user's passwords will remain secure
even after many (e.g., 50-100) breaches.


Figure 1.6: Secret Random Mapping from Pictures to Digits


Figure 1.7: Mnemonics to help memorize the secret mapping $\sigma$. (a) Original photo (an eagle). (b) Mnemonic to help the user remember $\sigma(\text{eagle}) = 2$. (c) Mnemonic to help the user remember $\sigma(\text{eagle}) = 6$.

1.4.1  Overview 

Memorizing a Random Mapping. To use our human computable password
schemes the user begins by memorizing a secret random mapping
$\sigma : [n] \to \{0, \ldots, 9\}$ from $n$ objects (e.g., letters, pictures) to digits.
See Figure 1.6 for an example.

The computer can provide the user with mnemonics to help memorize the
secret mapping $\sigma$ (see Figures 1.7b and 1.7c). For example, if we wanted to help
the user remember that $\sigma(\text{eagle}) = 2$ we would show the user Figure 1.7b. We
observe that a $10 \times n$ table of mnemonic images would be sufficient to help the
user memorize any random mapping $\sigma$. We stress that the computer will only save
the original image (e.g., Figure 1.7a). The mnemonic image (e.g., Figure 1.7b or
1.7c) would be discarded after the user memorizes $\sigma(\text{eagle})$.


Single-Digit Challenges. In our scheme the user computes each of his passwords
by responding to a sequence of single-digit challenges. A single-digit
challenge is a tuple $C \in [n]^{14}$ of fourteen objects. See Figure 1.8 for an example.
To compute the response $f(\sigma(C))$ to a challenge $C = \{x_0, \ldots, x_{13}\}$ the user
computes

$$f(\sigma(C)) = \sigma\left(x_{\sigma(x_{10}) + \sigma(x_{11}) \bmod 10}\right) + \sigma(x_{12}) + \sigma(x_{13}) \bmod 10.$$

Observe that this computation involves just three addition operations modulo ten.
See Figure 1.9 for an example. In this example the response to the challenge
$C = \{x_0 = \text{burger}, x_1 = \text{eagle}, \ldots, x_{10} = \text{lightning}, x_{11} = \text{dog}, x_{12} = \text{man standing on world}, x_{13} = \text{kangaroo}\}$ is

$$\begin{aligned}
f(\sigma(C)) &= \sigma\left(x_{\sigma(x_{10}) + \sigma(x_{11}) \bmod 10}\right) + \sigma(x_{12}) + \sigma(x_{13}) \bmod 10 \\
&= \sigma\left(x_{\sigma(\text{lightning}) + \sigma(\text{dog}) \bmod 10}\right) + \sigma(\text{man standing on world}) + \sigma(\text{kangaroo}) \bmod 10 \\
&= \sigma\left(x_{9 + 3 \bmod 10}\right) + \sigma(\text{man standing on world}) + \sigma(\text{kangaroo}) \bmod 10 \\
&= \sigma(\text{minions}) + \sigma(\text{man standing on world}) + \sigma(\text{kangaroo}) \bmod 10 \\
&= 7 + 4 + 5 \bmod 10 = 6.
\end{aligned}$$


Figure 1.8: A single-digit challenge


We  stress  that  this  computation  is  done  entirely  in  the  user's  head.  It  takes  the 
author  of  this  thesis  7.5  seconds  on  average  to  compute  each  response. 
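
The response rule can also be written as a few lines of code. The sketch below is illustrative only: the function name `respond` is hypothetical, the digits for lightning, dog, minions, man standing on world and kangaroo are taken from the worked example, and every other object name and digit is a made-up placeholder.

```python
def respond(sigma, challenge):
    """Return f(sigma(C)) for a challenge C = (x_0, ..., x_13) of 14 objects."""
    if len(challenge) != 14:
        raise ValueError("a single-digit challenge has exactly 14 objects")
    # The objects in slots 10 and 11 select one of the first ten slots.
    j = (sigma[challenge[10]] + sigma[challenge[11]]) % 10
    # Add the digit of the selected object to the digits of slots 12 and 13.
    return (sigma[challenge[j]] + sigma[challenge[12]] + sigma[challenge[13]]) % 10


# Secret mapping: only the five digits used in the worked example come from
# the text; "burger", "eagle" and "obj3".."obj9" carry placeholder digits.
sigma = {
    "burger": 1, "eagle": 2, "minions": 7,
    "obj3": 0, "obj4": 3, "obj5": 8, "obj6": 6, "obj7": 5, "obj8": 4, "obj9": 9,
    "lightning": 9, "dog": 3, "man standing on world": 4, "kangaroo": 5,
}

challenge = [
    "burger", "eagle", "minions",
    "obj3", "obj4", "obj5", "obj6", "obj7", "obj8", "obj9",
    "lightning", "dog", "man standing on world", "kangaroo",
]

response = respond(sigma, challenge)
print(response)  # 6, as in the worked example
```

Only three of the ten candidate objects in slots 0 to 9 are ever looked up for a given challenge, which is why the mental computation needs just three additions modulo ten.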


Creating an Account. To help the user create an account the computer would first
pick a sequence of single-digit challenges $C_1, \ldots, C_t$, where the security parameter
is typically $t = 10$, and would display the first challenge $C_1$ to the user (see Figure
1.10 for an example). To compute the first digit of his password the user would
compute $f(\sigma(C_1))$. After the user types in the first digit $f(\sigma(C_1))$ of his password
the computer will display the second challenge $C_2$ to the user (see Figure 1.11).
After the user creates his account the computer will store the challenges $C_1, \ldots, C_{10}$
in public memory. The password $pw = f(\sigma(C_1)) \ldots f(\sigma(C_t))$ will not be stored.


Figure 1.9: Computing the response $f(\sigma(C))$. Computed response: $\sigma(\text{lightning}) + \sigma(\text{dog}) = 9 + 3 \bmod 10 = 2$, so $f(\sigma(C)) = 7 + 4 + 5 \bmod 10 = 6$.


Authentication. Authenticating is very similar to creating an account. To help
the user recompute his password for an account the computer first looks up the
challenges $C_1, \ldots, C_t$ which were stored in public memory, and the user authenticates
by computing his password $pw = f(\sigma(C_1)) \ldots f(\sigma(C_t))$. We stress that
the single-digit challenges the user sees during authentication will be the same
single-digit challenges that the user saw when he created the account.


Helping the user remember his secret mapping. As before the computer keeps
track of when the user rehearses each value of his secret mapping (e.g., $(i, \sigma(i))$ for
each $i \in [n]$), and reminds the user to rehearse any part of his secret mapping that
he hasn't used in a long time. One advantage of our human computable password
scheme (compared with the Shared Cues scheme of Chapter 2) is that most users
will use each part of their secret mapping often enough that they will not need
to be reminded to rehearse (see the discussion in Chapter 3.6). The disadvantage is
that we require the user to spend extra effort computing his passwords each time
he authenticates.


Figure 1.11: Login Screen after the user responds to the first single-digit challenge


Security Guarantees. The security guarantees of our human computable password
scheme are much stronger than the security guarantees of Shared Cues. We
provide strong evidence that any polynomial time adversary needs to see at least
$\tilde{\Omega}(n^{1.5}/t)$ of the user's passwords before he can start to predict the user's
passwords at other accounts. For example, if the user memorized a secret mapping
from one-hundred pictures to digits then the adversary would need to see
approximately one-hundred of the user's ten digit passwords before he could start
predicting the user's passwords for other accounts.
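
The "one-hundred passwords" figure in this example can be checked with a short calculation. This is only a back-of-the-envelope sketch: it takes the exponent to be $1.5$ (the value of $s(f)$ reported for the first scheme), uses $n = 100$ pictures and $t = 10$ digits per password, and ignores the constants and logarithmic factors hidden by the asymptotic notation.

$$\frac{n^{1.5}}{t} = \frac{100^{1.5}}{10} = \frac{1000}{10} = 100.$$

Each ten-digit password reveals $t = 10$ challenge-response pairs, so roughly $100$ passwords correspond to roughly $n^{1.5} = 1000$ pairs.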


Technical Contributions. We develop a general framework for analyzing the
security of a human computable password management scheme and we propose
two candidate human computable password management schemes in Chapter 3.
We give evidence that a human computable password management scheme will
remain secure until the adversary has seen at least $\tilde{\Omega}(n^{s(f)})$ challenge-response
pairs $(C, f(\sigma(C)))$ (see Theorem 6 in Chapter 3). Here, $s(f) = \min\{r(f)/2, g(f) + 1\}$
is a composite security parameter which captures $g(f)$ (how many inputs to $f$ need
to be fixed to make $f$ linear?) and $r(f)$ (what is the largest value of $r$ such that
the distribution over challenge-response pairs is $(r - 1)$-wise independent?). We
show that $s(f) = 1.5$ for our first scheme and $s(f) = 2$ for our second scheme. In
particular we prove that any statistical adversary needs to see at least $\tilde{\Omega}(n^{r(f)/2})$
challenge-response pairs $(C, f(\sigma(C)))$ before he can even approximately recover
the secret mapping $\sigma$. Our lower bound is based on the statistical dimension of the
distribution over challenge-response pairs induced by $f$ and $\sigma$. We stress that our
analysis of the statistical dimension applies to arbitrary functions $f : \mathbb{Z}_d^k \to \mathbb{Z}_d$, not
just functions that are easy for humans to compute. Our analysis of the statistical
dimension generalizes recent results of Feldman et al. [73], which only applied to
binary predicates (e.g., $d = 2$), and may be of independent interest.


1.5 Empirical Validation of User Model


We  are  currently  running  user  studies  to  empirically  test  the  usability  model 
we  presented  in  Chapter  2.4  and  obtain  estimates  of  rehearsal  parameters  that 
are  specific  to  the  password  setting.  We  are  collaborating  with  the  CyLab  Usable 
Privacy  and  Security  Laboratory  (CUPS)  to  conduct  these  online  user  studies  using 
Amazon's  Mechanical  Turk  framework.  In  Chapter  4  we  describe  the  design  of  the 
user  study  and  present  initial  results  from  the  study.  Briefly,  each  participant  in  the 
study  is  asked  to  memorize  several  randomly  selected  actions  (e.g.,  'swallowing,' 
'kicking')  and  several  randomly  selected  objects  (e.g.,  'bike,'  'car').  Participants 
assigned  to  the  mnemonic  group  were  given  specific  instructions  about  how  to 
memorize  the  actions  following  the  Shared  Cues  password  management  scheme  in 
Chapter  2.6.  To  help  our  participants  memorize  one  of  their  action(s)  and  object(s) 
each  participant  was  shown  two  additional  photos  of  a  person  and  a  scene  and 
was  asked  to  imagine  the  corresponding  person-action-object  story  taking  place 
inside  the  scene  (e.g.,  the  user  might  be  shown  photos  of  Bill  Gates  and  a
beach  and  asked  to  imagine  "Bill  Gates  swallowing  a  bike  on  the  beach.").  Other 
participants  were  assigned  to  the  standard  group  and  were  simply  instructed  to 
memorize  their  actions  and  objects  (e.g.,  by  typing  in  their  words  several  times). 
After  participants  memorized  their  words  we  periodically  asked  them  to  return 
to  rehearse  their  words.  During  each  rehearsal  participants  in  the  mnemonic 
group  were  shown  the  photos  of  the  person  and  the  scene  as  a  cue  to  help  them 
remember  the  associated  action  and  object.  Participants  in  the  standard  group  were 
simply  asked  to  recall  their  actions  and  objects.  Each  participant  was  assigned  a 
specific  rehearsal  schedule  (e.g.,  participants  in  the  aggressive  rehearsal  group 
were  reminded  to  rehearse  on  the  following  days:  1, 2, 4, 8, 16, 32, 64).  Because  the 
duration  of  the  study  is  up  to  one-hundred  days  we  do  not  yet  have  the  full  results 
from  the  user  study.  In  Chapter  4.4  we  report  the  results  for  rehearsals  that  have 
been  completed.  Our  results  support  the  hypothesis  that  recall  is  significantly 
improved  by  asking  users  to  follow  specific  mnemonic  techniques  to  memorize 
their  actions  and  objects.  Our  results  also  demonstrate  the  benefit  of  having  several 
early  rehearsals. 
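
The aggressive schedule quoted above (days 1, 2, 4, 8, 16, 32, 64) doubles the gap before each reminder. As a minimal sketch, assuming the schedule simply keeps doubling until the end of the study, it can be generated as follows. The function name `doubling_schedule` is hypothetical and is not taken from the study software.

```python
def doubling_schedule(last_day):
    """Return the rehearsal days 1, 2, 4, ... that fall on or before last_day."""
    days = []
    day = 1
    while day <= last_day:
        days.append(day)
        day *= 2  # each reminder comes twice as late as the previous one
    return days


schedule = doubling_schedule(100)  # the study lasts up to one hundred days
print(schedule)
```

For a one-hundred-day study this yields seven rehearsals, most of them in the first two weeks.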
1.6  A  Defense  against  Online  Attacks


To guard against online attacks many organizations adopt a k-strikes policy in
which the user is locked out of his account after k incorrect guesses⁴. However,
this defense is often insufficient because many users select trivial passwords like
'password' and '123456' [5]. To provide further protection against online attacks
organizations often adopt password composition policies. A password composition
policy is a set of rules specifying which passwords users are allowed to select
and which passwords users are not allowed to select.

In Chapter 5 we initiate the algorithmic study of password composition policies.
Such policies restrict the space of passwords to a subset of allowed passwords and
force each user to pick a password in this subset. Thus, $n$ users induce a distribution
over passwords where for a password $w$, $\Pr[w] = \frac{1}{n} |\{i : i \text{ picks } w\}|$. By declaring
different subsets of allowed passwords, different password composition policies
induce different distributions. Our work formalizes and addresses the algorithmic
problem a server administrator faces when designing a password composition
policy; we ask:

In what settings can the information about the users' preferences over passwords
allow us to design a password composition policy that is guaranteed
to induce a password distribution as close to uniform as possible?


1.6.1  Overview 

Suppose that a server administrator has created $m$ candidate positive rules $R_1, \ldots, R_m$,
where each positive rule $R_i \subseteq \mathcal{P}$ specifies a subset of passwords that the server
administrator expects to be strong (e.g., "the set of all passwords that are at least 10
characters long and include at least one number, at least one lowercase character
and at least one uppercase character," or "the set of all passwords longer than 12
characters"). A password composition policy is given by a subset $S \subseteq [m]$ of active
rules. The user is allowed to choose only passwords from the set $\bigcup_{i \in S} R_i$ of
permitted passwords (i.e., any password contained in an active rule). Observe that
there are $2^m$ different password composition policies that we could form. Which
of these policies is optimal for security? In Chapter 5 we give a sampling algorithm
to find the optimal password composition policy. A sampling algorithm
is an algorithm that is allowed to sample a random user and ask that user what
password he would select under a particular password composition policy $\bigcup_{i \in S} R_i$.
Our algorithm is efficient both in its sample complexity and in its running time.

⁴Other organizations instead require the user to solve a CAPTCHA [152] after several wrong
guesses [60]. This prevents an adversary from maliciously locking out another user (e.g., a
competing bidder on eBay).


Negative Rules. We could also specify a password composition policy using
negative rules (e.g., the user can only select passwords in the set $\mathcal{P} \setminus \bigcup_{i \in S} R_i$). A
negative rule specifies a subset of passwords that the server administrator expects
to be weak (e.g., "the set of all passwords that are not longer than 12 characters").
In Chapter 5 we show that it is computationally intractable to find the optimal
policy in this negative rules setting.


Experiments. We tested our algorithm on a dataset of 32 million user passwords
using a small set of rules to find the optimal password composition policy in the
positive rules setting. Because we only considered twenty-one different rules we
were also able to find the optimal password composition policy in the negative rules
setting by brute force search (if we used the positive rule "all passwords longer
than 9 characters" then the negative version of that rule would be "all passwords
that are not longer than 9 characters"). The optimal policy in the positive rules
setting was to allow any password $pw \in \mathcal{P}$ that satisfies any of the following
conditions: 1) $pw$ is at least 14 characters long, 2) $pw$ contains at least 2 special
symbols (e.g., !,*,&,@), OR 3) $pw$ is at least 8 characters long and contains at least
one upper case letter and at least one digit. The optimal policy in the negative
rules setting was to allow any password $pw \in \mathcal{P}$ that satisfies all of the following
conditions: 1) $pw$ is at least 10 characters long, 2) $pw$ contains at least 2 digits, 3)
$pw$ contains at least one special symbol (e.g., !,*,&,@), 4) $pw$ contains at least one
lowercase letter, AND 5) $pw$ is not in the dictionary. The optimal positive rules
policy was nearly as good as the optimal negative rules policy. We also used an
efficient heuristic algorithm to find a good policy in the negative rules setting.
The optimal positive rules policy was consistently far better than the negative
rules policies returned by our heuristic algorithm. Our experiments indicate that
it may be advantageous to find the optimal positive rules policy whenever we
have a large set of potential rules and are not able to find the optimal negative
rules policy by brute force. One limitation of these experiments is that they rely
on an assumption about the way users select passwords. See Section 5.1 for more
discussion about this normalized probabilities assumption, and see Section 5.6 for
more details about the experiments.
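
To make the two reported policies concrete, the sketch below expresses each one as a membership test: the positive policy accepts a password that satisfies any active rule, while the negative policy accepts a password only if it satisfies every listed condition. This is an illustration, not the code used in the experiments. The function names, the choice of ASCII punctuation as the set of "special symbols", and the tiny `dictionary` set are all assumptions.

```python
import string

# Assumption: a "special symbol" is any ASCII punctuation character.
SPECIAL = set(string.punctuation)


def count(pw, pred):
    """Number of characters of pw that satisfy pred."""
    return sum(1 for ch in pw if pred(ch))


def allowed_positive(pw):
    """Positive rules: permitted if ANY of the three conditions holds."""
    return (
        len(pw) >= 14
        or count(pw, lambda c: c in SPECIAL) >= 2
        or (len(pw) >= 8
            and any(c.isupper() for c in pw)
            and any(c.isdigit() for c in pw))
    )


def allowed_negative(pw, dictionary):
    """Negative rules: permitted only if ALL five conditions hold."""
    return (
        len(pw) >= 10
        and count(pw, str.isdigit) >= 2
        and count(pw, lambda c: c in SPECIAL) >= 1
        and any(c.islower() for c in pw)
        and pw not in dictionary
    )


dictionary = {"password", "123456"}  # hypothetical stand-in word list
for pw in ["password", "Password1", "a!b*", "tr0ub4dor&3x"]:
    print(pw, allowed_positive(pw), allowed_negative(pw, dictionary))
```

The asymmetry is visible in the code: adding a positive rule can only enlarge the set of permitted passwords, whereas adding a negative condition can only shrink it.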


1.7  A  Defense  Against  Offline  Attacks 

1.7.1  Background 

Any adversary who has obtained the cryptographic hash of a user's password can
mount an automated brute-force attack to crack the password by comparing the
cryptographic hash of the user's password with the cryptographic hashes of likely
password guesses. This attack is called an offline dictionary attack, and there are
many password crackers that an adversary could use [63]. Offline dictionary attacks
against passwords are, unfortunately, powerful and commonplace [87].
Offline attacks are becoming increasingly dangerous as computing hardware improves
(e.g., a modern GPU can evaluate a cryptographic hash function like SHA2
about 250 million times per second [165]) and as more and more training data (e.g.,
leaked passwords from prior breaches) becomes available [87]. Adversaries have
been able to compromise servers at large companies (e.g., Zappos, LinkedIn, Sony,
Gawker [5, 9, 10, 11, 13, 28]) resulting in the release of millions of cryptographic
password hashes⁵. Symantec reported that compromised passwords have significant
economic value to an adversary (e.g., compromised passwords are sold on
the black market for between $4 and $30 each) [79].

Because cryptographic hash functions like SHA1, SHA2 and MD5 were designed
for fast hardware computation they are poor choices for a password hash
function, as they allow an offline adversary to evaluate millions of password
guesses per second. One simple way that an organization can mitigate the threat
of offline attacks is by using a hash function like BCRYPT [122] which is intentionally
designed to be slow to compute. The BCRYPT hash function takes a parameter
which allows the programmer to specify how costly the hash computation should
be. The downside to this approach is that it also increases costs for the company
that stores the passwords (e.g., if we want it to cost the adversary $1,000 for every
million guesses then it will also cost the company at least $1,000 for every million
login attempts). Another practical way for an organization to help defend against
offline attacks is to adopt the practice of password salting (e.g., instead of storing
the cryptographic hash of the password H(pw) the server stores (H(pw, r), r) for a
random string r [16])⁶.

⁵In a few of these cases [5, 10] the passwords were stored in the clear.

However, even these defenses will fail to protect many users against offline
attacks. It has been repeatedly demonstrated that users tend to select easily guessable
passwords [39, 66, 92], and password crackers are able to quickly break many
of these passwords [136].
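
The two server-side defenses discussed above, a deliberately slow hash function and a per-user salt, can be sketched with the Python standard library. BCRYPT itself is not part of the standard library, so this sketch substitutes PBKDF2 (`hashlib.pbkdf2_hmac`), whose iteration count plays the role of the cost parameter. The function names and the default of 100,000 iterations are illustrative assumptions, not recommendations from this thesis.

```python
import hashlib
import hmac
import secrets


def hash_password(pw, iterations=100_000, salt=None):
    """Return (digest, salt, iterations), i.e. the pair (H(pw, r), r) plus the cost."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random string r for each user
    digest = hashlib.pbkdf2_hmac("sha256", pw.encode("utf-8"), salt, iterations)
    return digest, salt, iterations


def verify_password(pw, record):
    """Recompute the salted hash of a login attempt and compare it to the record."""
    digest, salt, iterations = record
    candidate = hashlib.pbkdf2_hmac("sha256", pw.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)


record = hash_password("123456")
ok = verify_password("123456", record)
bad = verify_password("password", record)
print(ok, bad)
```

Raising `iterations` makes every guess more expensive for an offline adversary and, as noted above, makes every legitimate login more expensive for the server by the same factor. Neither defense helps a user whose password is among the adversary's first few guesses.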


1.7.2  GOTCHAs 

In Chapter 6 we introduce GOTCHAs (Generating panOptic Turing Tests to Tell
Computers and Humans Apart) as a way of preventing automated offline dictionary
attacks against user selected passwords. A GOTCHA is a randomized puzzle
generation protocol, which involves interaction between a computer and a human.
Informally, a GOTCHA should satisfy two key properties: (1) The puzzles are easy
for the human to solve. (2) The puzzles are hard for a computer to solve even if it
has the random bits used by the computer to generate the final puzzle, unlike
a CAPTCHA [152]. Our main theorem demonstrates that GOTCHAs can be used
to mitigate the threat of offline dictionary attacks against passwords by ensuring
that a password cracker must receive constant feedback from a human being while
mounting an attack. Finally, we provide a candidate construction of GOTCHAs
based on inkblot images. This construction relies on the usability assumption that
users can recognize the phrases that they originally used to describe each inkblot
image, a much weaker usability assumption than previous password systems
based on inkblots which required users to recall their phrase exactly [147]. We
conducted a user study to evaluate the usability of our GOTCHA construction
and generated a GOTCHA challenge where artificial intelligence and security
researchers are encouraged to try to crack several passwords protected with our
scheme.


1.7.3  Overview 

Creating an account. To create an account in our scheme the user first selects a
username $u$ and a password $pw$, and sends $(u, pw)$ to the server. After verifying
that the username $u$ is available and that $pw$ is permitted under the password
composition policy the server generates ten random Inkblot images $I_1, \ldots, I_{10}$ using
$(u, pw)$ as a random seed and sends these images to the user. The user responds by
sending back ten labels $\ell_1, \ldots, \ell_{10}$ (one for each Inkblot image). The server stores
these labels in a random order.

⁶Rainbow tables, which consist of precomputed hashes, are often used by an adversary to
significantly speed up a password cracking attack because the same table can be reused to attack
each user when the passwords are unsalted [117].


Figure 1.12: GOTCHA Authentication Example (panels: Password, Account, Generate Puzzles, Check Password; in the example pw is 123456)


Authenticating. To authenticate the user sends his username and password
$(u, pw)$ to the server. The server responds by regenerating the ten random Inkblot
images $I_1, \ldots, I_{10}$ using $(u, pw)$ as a random seed and sends these images to the user
along with the labels $\ell_1, \ldots, \ell_{10}$ (in a random order). The user matches each label
$\ell_i$ with the appropriate Inkblot image. Figure 1.12 illustrates this process. The
server authenticates the user if the password is correct and all (most) of the Inkblot
images are matched correctly. We stress that if the user's password is incorrect
(e.g., if the user sends $(u, pw')$ where $pw \neq pw'$) then the user will see different
Inkblot images $I'_1, \ldots, I'_{10}$ ($I_i \neq I'_i$ for each $i \le 10$). The labels $\ell_1, \ldots, \ell_{10}$ will be the
same in either case.


Server. After the user labels each of his Inkblots the server selects a random
permutation $\pi : [10] \to [10]$ and a random salt value $s \in \{0,1\}^*$ and stores the tuple
$(u, s, \ell_{\pi(1)}, \ldots, \ell_{\pi(10)}, H(u, s, pw, \pi))$. To authenticate the user the server verifies that
$H(u, s, pw, \pi) = H(u, s, pw', \pi')$, where $pw'$ and $\pi'$ are the password and permutation
provided by the user during authentication. We stress that the server does not
store the Inkblot images $I_1, \ldots, I_{10}$ or the random permutation $\pi$ so an adversary
would need to simultaneously guess $pw$ and $\pi$ to crack the user's password in an
offline attack.
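
The server-side bookkeeping described above can be sketched as follows. This is a simplified illustration under several assumptions: SHA-256 stands in for the hash function H (a deployment would presumably prefer a slow password hash), inkblot generation from the seed (u, pw) is omitted entirely, the user is modeled as matching every label perfectly, and all function and variable names are hypothetical.

```python
import hashlib
import hmac
import random
import secrets


def H(u, s, pw, perm):
    """Stand-in hash of (u, s, pw, pi); each field is length-prefixed."""
    h = hashlib.sha256()
    for part in (u.encode("utf-8"), s, pw.encode("utf-8"), bytes(perm)):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.hexdigest()


def create_record(u, pw, labels):
    """labels[i] is the user's label for inkblot i; the labels are stored shuffled."""
    perm = list(range(len(labels)))
    random.SystemRandom().shuffle(perm)      # random permutation pi
    s = secrets.token_bytes(16)              # random salt s
    shuffled = [labels[i] for i in perm]     # labels in the order pi(1), ..., pi(10)
    # Neither the permutation nor the inkblot images are stored.
    return (u, s, shuffled, H(u, s, pw, perm))


def authenticate(record, pw_guess, perm_guess):
    """Accept only if both the password and the label matching are correct."""
    u, s, _labels, stored = record
    return hmac.compare_digest(H(u, s, pw_guess, perm_guess), stored)


labels = [f"label {i}" for i in range(10)]   # hypothetical inkblot descriptions
record = create_record("alice", "123456", labels)

# A user who recognizes each label maps it back to its inkblot, recovering pi.
user_perm = [labels.index(lab) for lab in record[2]]
wrong_perm = user_perm[:]
wrong_perm[0], wrong_perm[1] = wrong_perm[1], wrong_perm[0]

print(authenticate(record, "123456", user_perm))
print(authenticate(record, "123456", wrong_perm))
print(authenticate(record, "letmein", user_perm))
```

Because the stored hash covers the permutation as well as the password, an offline adversary who guesses the right password still cannot confirm the guess without also producing the matching between labels and inkblots.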
Chapter  2

Naturally  Rehearsing  Passwords 


2.1  Introduction 

A typical computer user today manages passwords for many different online
accounts. Users struggle with this task, often forgetting their passwords or
adopting insecure practices, such as using the same password for multiple accounts
and selecting weak passwords [39, 52, 75, 102]. While there are many articles,
books, papers and even comics about selecting strong individual passwords
[4, 46, 49, 82, 109, 132, 146, 162], there is very little work on password management
schemes (systematic strategies to help users create and remember multiple
passwords) that are both usable and secure. In this chapter, we present a rigorous
treatment of password management schemes. Our contributions include
a formalization of important aspects of a usable scheme, a quantitative security
model, and a construction that provably achieves the competing security and
usability properties.


Usability Challenge. We consider a setting where a user has two types of memory:
persistent memory (e.g., a sticky note or a text file on his computer) and
associative memory (e.g., his own human memory). We assume that persistent
memory is reliable and convenient but not private (i.e., accessible to an adversary).
In contrast, a user's associative memory is private but lossy: if the user
does not rehearse a memory it may be forgotten. While our understanding of
human memory is incomplete, it has been an active area of research [23] and there
are many mathematical models of human memory [19, 100, 106, 150, 157]. These
models differ in many details, but they all model an associative memory with
cue-association pairs: to remember $a$ (e.g., a password) the brain associates the
memory with a context $c$ (e.g., a public hint or cue); such associations are strengthened
by rehearsal. A central challenge in designing usable password schemes is
thus to create associations that are strong and to maintain them over time through
rehearsal. Ideally, we would like the rehearsals to be natural, i.e., they should be
a side-effect of users' normal online activity. Indeed insecure password management
practices adopted by users, such as reusing passwords, improve usability
by increasing the number of times a password is naturally rehearsed as users visit
their online accounts.


Security  Challenge.  Secure  password  management  is  not  merely  a  theoretical 
problem — there  are  numerous  real-world  examples  of  password  breaches  [3,  6,  8, 
9,  10, 11, 12, 13,  28,  52, 141].  Adversaries  may  crack  a  weak  password  in  an  online 
attack  where  they  simply  visit  the  online  account  and  try  as  many  guesses  as  the  site 
permits.  In  many  cases  (e.g.,  Zappos,  LinkedIn,  Sony,  Gawker  [6,  8,  9, 11, 13,  28])
an  adversary  is  able  to  mount  an  offline  attack  to  crack  weak  passwords  after 
the  cryptographic  hash  of  a  password  is  leaked  or  stolen.  To  protect  against 
an  offline  attack,  users  are  often  advised  to  pick  long  passwords  that  include 
numbers,  special  characters  and  capital  letters  [132].  In  other  cases  even  the 
strongest  passwords  are  compromised  via  a  plaintext  password  leak  attack  (e.g., 
[5, 10, 12, 141]),  for  example,  because  the  user  fell  prey  to  a  phishing  attack  or  signed 
into  his  account  on  an  infected  computer  or  because  of  server  misconfigurations. 
Consequently,  users  are  typically  advised  against  reusing  the  same  password. 
A  secure  password  management  scheme  must  protect  against  all  these  types  of 
breaches. 


Contributions.  We  precisely  define  the  password  management  problem  in  Sec¬ 
tion  2.3.  A  password  management  scheme  consists  of  a  generator — a  function  that 
outputs  a  set  of  public  cue-password  pairs — and  a  rehearsal  schedule.  The  generator 
is  implemented  using  a  computer  program  whereas  the  human  user  is  expected 
to  follow  the  rehearsal  schedule  for  each  cue.  This  division  of  work  is  critical — the 
computer  program  performs  tasks  that  are  difficult  for  human  users  (e.g.,  gener¬ 
ating  random  bits)  whereas  the  human  user's  associative  memory  is  used  to  store 
passwords  since  the  computer's  persistent  memory  is  accessible  to  the  adversary. 

Quantifying  Usability.  In  the  same  way  that  security  proofs  in  cryptography  are 
based on complexity-theoretic assumptions (e.g., hardness of factoring and discrete
logarithm), we quantify usability by introducing usability assumptions. In
particular,  password  management  relies  on  assumptions  about  human  memory, 
e.g.,  that  a  user  who  follows  a  particular  rehearsal  schedule  will  successfully  main¬ 
tain  the  corresponding  memory.  These  assumptions  are  informed  by  research  in 
cognitive  science  and  can  be  tested  empirically.  Given  rehearsal  requirements  and 
a  user's  visitation  schedule  for  each  account,  we  use  the  total  number  of  extra 
rehearsals  that  the  user  would  have  to  do  to  remember  all  of  his  passwords  as  a 
measure  of  the  usability  of  the  password  scheme  (Section  2.4).  Specifically,  in  our 
usability  analysis,  we  use  the  Expanding  Rehearsal  Assumption  (ER)  that  allows  for 
memories  to  be  rehearsed  with  exponentially  decreasing  frequency,  i.e.,  rehearse 
at  least  once  in  the  time-intervals  (days)  [1, 2),  [2, 4),  [4, 8)  and  so  on.  Few  long-term 
memory experiments have been conducted, but ER is consistent with known studies
[144, 160]. Our memory assumptions are parameterized by a constant $\sigma$ which
represents  the  strength  of  the  mnemonic  devices  used  to  memorize  and  rehearse 
a  cue-association  pair.  Strong  mnemonic  techniques  [78, 143]  exploit  the  associa¬ 
tive  nature  of  human  memory  discussed  earlier  and  its  remarkable  visual/spatial 
capacity  [145]. 

Quantifying  Security.  We  present  a  game  based  security  model  for  a  password  man¬ 
agement  scheme  (Section  2.5)  in  the  style  of  exact  security  definitions  [26].  The 
game is played between a user $\mathcal{U}$ and a resource-bounded adversary $\mathcal{A}$ whose
goal  is  to  guess  one  of  the  user's  passwords.  Our  game  models  three  commonly 
occurring  breaches  (online  attack,  offline  attack,  plaintext  password  leak  attack). 

Our  Construction.  We  present  a  new  password  management  scheme,  which  we 
call  Shared  Cues,  and  prove  that  it  provides  strong  security  and  usability  prop¬ 
erties  (see  Section  2.6).  Our  scheme  incorporates  powerful  mnemonic  techniques 
through  the  use  of  public  cues  (e.g.,  photos)  to  create  strong  associations.  The  user 
first associates a randomly generated person-action-object story (e.g., Bill Gates
swallowing  a  bike)  with  each  public  cue.  We  use  the  Chinese  Remainder  Theorem 
to  share  cues  across  sites  in  a  way  that  balances  several  competing  security  and 
usability  goals:  1)  Each  cue-association  pair  is  used  by  many  different  web  sites 
(so  that  most  rehearsal  requirements  are  satisfied  naturally),  2)  the  total  number 
of  cue-association  pairs  that  the  user  has  to  memorize  is  low,  3)  each  web  site  uses 
several  cue-association  pairs  (so  that  passwords  are  secure)  and  4)  no  two  web 
sites share too many cues (so that passwords remain secure even after the adversary
obtains some of the user's other passwords). We show that our construction
achieves  an  asymptotically  optimal  balance  between  these  security  and  usability 
goals  (Lemma  2,  Theorem  3). 
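
To make the cue-sharing idea concrete, the following minimal sketch assumes one natural way of applying the Chinese Remainder Theorem: pick pairwise coprime moduli and give account i the cues (j, i mod n_j). The helper names `crt_cue_sets` and `max_shared_cues`, the moduli (9, 10, 11, 13) and the choice of 90 accounts are illustrative assumptions, not values taken from this section; the actual construction and its guarantees are the ones given in Section 2.6.

```python
from itertools import combinations
from math import gcd


def crt_cue_sets(m, moduli):
    """Assign each of m accounts one cue per modulus (hypothetical helper).

    Account i receives the cues (j, i mod n_j) for every modulus n_j.
    """
    # The sharing argument relies on the moduli being pairwise coprime.
    assert all(gcd(a, b) == 1 for a, b in combinations(moduli, 2))
    return [frozenset((j, i % n) for j, n in enumerate(moduli)) for i in range(m)]


def max_shared_cues(cue_sets):
    """Largest number of cues shared by any two distinct accounts."""
    return max(len(a & b) for a, b in combinations(cue_sets, 2))


if __name__ == "__main__":
    moduli = (9, 10, 11, 13)   # illustrative, pairwise coprime
    m = 90                     # illustrative number of accounts
    sets = crt_cue_sets(m, moduli)
    print("cues per account:", len(sets[0]))
    print("total cues to memorize:", len(set().union(*sets)))
    print("max cues shared by two accounts:", max_shared_cues(sets))
```

In this sketch two accounts share cue j exactly when their indices agree modulo n_j. If two distinct accounts shared two cues, their indices would agree modulo a product of two moduli, which is at least 90 here, and that is impossible for distinct indices below 90.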


2.2 Related Work

A distinctive goal of our work is to quantify the usability of password management
schemes by drawing on ideas from cognitive science and to leverage this
understanding to design schemes with acceptable usability. We view the results
of this paper, which employ usability assumptions about rehearsal requirements, as
an  initial  step  towards  this  goal.  While  the  mathematical  constructions  start  from 
the  usability  assumptions,  the  assumptions  themselves  are  empirically  testable, 
e.g.,  via  longitudinal  user  studies.  In  contrast,  a  line  of  prior  work  on  usability  has 
focused  on  empirical  studies  of  user  behavior  including  their  password  manage¬ 
ment  habits  [52,  75, 102],  the  effects  of  password  composition  rules  (e.g.,  requiring 
numbers  and  special  symbols)  on  individual  passwords  [34, 101],  the  memorabil¬ 
ity  of  individual  system  assigned  passwords  [140],  graphical  passwords  [27,  48], 
and  passwords  based  on  implicit  learning  [38].  These  user  studies  have  been  lim¬ 
ited  in  duration  and  scope  (e.g.,  study  retention  of  a  single  password  over  a  short 
period  of  time).  Other  work  [43]  articulates  informal,  but  more  comprehensive, 
usability  criteria  for  password  schemes. 

Our  use  of  cued  recall  is  driven  by  evidence  that  it  is  much  easier  than  pure  recall 
[23].  We  also  exploit  the  large  human  capacity  for  visual  memory  [145]  by  using 
pictures  as  cues.  Prior  work  on  graphical  passwords  [27,  48]  also  takes  advantage 
of  these  features.  However,  our  work  is  distinct  from  the  literature  on  graphical 
passwords  because  we  address  the  challenge  of  managing  multiple  passwords. 
More  generally,  usable  and  secure  password  management  is  an  excellent  problem 
to  explore  deeper  connections  between  cryptography  and  cognitive  science. 

Security  metrics  for  passwords  like  (partial)  guessing  entropy  (e.g.,  how  many 
guesses does the adversary need to crack an $\alpha$-fraction of the passwords in a dataset
[39, 107, 121]? how many passwords can the adversary break with $\beta$ guesses per
account  [45]?)  were  designed  to  analyze  the  security  of  a  dataset  of  passwords 
from  many  users,  not  the  security  of  a  particular  user's  password  management 
scheme.  While  these  metrics  can  provide  useful  feedback  about  individual  pass¬ 
words  (e.g.,  they  rule  out  some  insecure  passwords)  they  do  not  deal  with  the 
complexities  of  securing  multiple  accounts  against  an  adversary  who  may  have 
gained background knowledge about the user from previous attacks; we refer
an  interested  reader  to  Appendix  7.7  for  more  discussion. 


Biometrics.  Biometric  factors  like  fingerprints  and  voice  recognition  have  been 
proposed  as  an  alternative  to  passwords.  For  example,  the  user's  computer  might 
record  features  from  the  user's  fingerprint  (a  biometric  template)  and  compare 
them  with  the  features  extracted  later  when  the  user  tries  to  authenticate  —  au¬ 
thentication  is  successful  if  these  biometric  templates  are  'close  enough'.  While 
biometrics  do  offer  a  usability  advantage  (e.g.,  there  is  usually  nothing  for  the 
user  to  remember)  there  are  many  drawbacks:  they  require  additional  hardware 
support  and  biometric  templates  are  difficult  (or  impossible)  to  change  if  they  are 
compromised.  Another  security  disadvantage  is  that  biometric  templates  often 
have  low  entropy.  For  example,  O'Gorman  estimated  that  biometric  templates 
based  on  fingerprints,  iris  scans  and  voice  recognition  contain  just  13.3,  19.9  and 
11.7  bits  of  entropy  respectively [118].  Storing  biometric  templates  is  also  a  chal¬ 
lenging  research  problem  because  biometric  templates  are  not  matched  exactly 
like  passwords,  but  based  on  the  closeness  of  the  two  signals.  There  has  been 
some  work  in  the  cryptographic  community  on  developing  secure  sketches  or  fuzzy 
extractors [65, 104]. A secure sketch is a function that extracts a stable signal, which
could  be  encrypted,  from  a  noisy  signal  with  high  minimum  entropy.  If  two  noisy 
signals  are  'close'  then  the  secure  sketch  will  extract  the  same  stable  signal  from 
both  noisy  signals.  However,  if  the  noisy  signal  has  low  minimum  entropy  like 
most  biometric  templates  then  the  stable  signal  we  extract  might  not  be  random  at 
all  because  these  techniques  entail  a  small  loss  in  entropy.  Consequently,  biometric 
templates  are  often  stored  in  the  clear  on  the  authentication  server,  which  means 
that  an  adversary  who  breaches  the  server  will  learn  the  user's  biometric  template 
directly. 


Password Managers. A password manager is a computer program that uses
an initial password (often called a master password) to generate password(s) for
the user. For example, if the user wanted to generate a password for a domain
$D$ using the initial password $pw$ then the password manager PwdHash [130]
would generate the password $H(D, pw)$.¹ Even if the user reuses the same initial
password $pw$ for a different domain $D'$ the final password $H(D', pw)$ will still
be different. Password managers like PwdHash [130] do provide users with several
security advantages: the user can be sure that his initial password is always
properly hashed and encrypted before it is sent to the authentication server,² and an
adversary who observes the password $H(D', pw)$ for a domain $D'$ will not be able to
guess the password $H(D, pw)$ for another domain $D$ unless he can break the master
password $pw$. However, password reuse is still a security problem even if the user
adopts a password manager. If a user selects one master password to generate
all of his passwords then an adversary who obtains this 'master password' would
be able to compromise all of the user's accounts. This master password could
be a tempting target for an adversary who is looking to maximize the return on
investment of his attack. An adversary who has obtained the cryptographic hash
of the user's final password for a domain $D$ would still be able to execute an offline
attack against the user's initial password (e.g., by applying the PwdHash function
as an extra step to verify each guess). Even if the master password is strong enough
to resist offline attacks the master password could still be exposed whenever the
user types it in to generate one of his passwords.³ A user study conducted by
Chiasson et al. indicated that the master passwords of many PwdHash users may
still be vulnerable to phishing attacks because of confusing user interfaces [54].
Unless the user can be sure that every device (e.g., laptop, smartphone, friend's
computer, public computer) he ever uses to log in is malware free and that there
are no 'hidden cameras' at any location (e.g., library, home, friend's house, office,
coffee shop) from which the user logs into an account, the user's master
password may be vulnerable. We stress that the password management schemes
we propose could be used in conjunction with a password manager like PwdHash
(e.g., instead of using one master password for all of his accounts the user would
create a different initial password for each domain $D$ by following our password
management scheme). In this case the user would get all of the security benefits
of a password manager (e.g., by ensuring that passwords are properly encrypted
and hashed before they are sent to a server) without the single point of failure
problem.

¹ Other password managers like 1Password and LastPass use a master password to encrypt a
database of passwords, which could be stored in untrusted memory (e.g., USB sticks, the cloud).
Gasti and Rasmussen showed that most of these password managers were vulnerable to attacks
by an adversary who could read the encrypted database, even if the user's master password was
strong [81].

² Unfortunately, some sites do not always properly hash the passwords stored on their servers [5,
13, 40]. Other sites do not properly encrypt their users' passwords before they are transmitted over
the Internet [40].

³ While this is a concern with or without a password manager, the damage of a plaintext
password leak attack is potentially much greater if the user only has one master password.
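
The domain-specific derivation H(D, pw) discussed above can be illustrated with a short sketch. This is not the actual PwdHash algorithm: HMAC-SHA256, the Base64 encoding, the 16-character truncation and the name `derive_site_password` are all assumptions chosen for illustration only.

```python
import base64
import hashlib
import hmac


def derive_site_password(domain: str, master_password: str, length: int = 16) -> str:
    """Illustrative stand-in for H(D, pw); NOT the real PwdHash function."""
    digest = hmac.new(master_password.encode("utf-8"),
                      domain.encode("utf-8"),
                      hashlib.sha256).digest()
    # Encode the digest as printable text and truncate to the requested length.
    return base64.urlsafe_b64encode(digest).decode("ascii")[:length]


if __name__ == "__main__":
    pw = "correct horse battery staple"   # hypothetical master password
    print(derive_site_password("example.com", pw))
    print(derive_site_password("example.org", pw))
```

Because the function is deterministic and public, an adversary who learns the derived password (or its hash) for one domain can still test guesses for the master password offline by recomputing it, which is the single point of failure described in the text.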
Combinatorial Designs. Our notion of $(n, \ell, \gamma)$-sharing set families (Definition
5) is equivalent to Nisan and Wigderson's definition of a $(k, m)$-design [115]. The
problem of finding maximally sized $(n, \ell, \gamma)$-sharing set families was considered
at least as early as 1956 by Paul Erdős and Alfréd Rényi [69], and applications of
some of these families may have been considered by Euler [71]. Erdős explored
properties of these families several times [68, 70], and Rödl built on his work [127].
$(n, \ell, \gamma)$-sharing set families were rediscovered by Nisan and Wigderson [115],
who used them to design a pseudorandom number generator. Trevisan showed
how to use $(n, \ell, \gamma)$-sharing set families to construct pseudorandom extractors
[149]. Because Nisan and Wigderson were focused on a different application
(constructing pseudorandom bit generators) the range of parameters that they
consider is not suitable for our password setting in which $\ell$ and $\gamma$ are constants.
In Appendix 7.7.2 we show that our construction of $(n, \ell, \gamma)$-sharing set families
has interesting applications in the construction of parallel pseudorandom number
generators. See Appendix 7.7 for more discussion of $(n, \ell, \gamma)$-sharing set families.


2.3  Definitions 

We use $\mathcal{P}$ to denote the space of possible passwords. A password management
scheme needs to generate $m$ passwords $p_1, \ldots, p_m \in \mathcal{P}$, one for each account $A_i$.

2.3.1  Associative  Memory  and  Cue-Association  Pairs 

Human  memory  is  associative.  Competitors  in  memory  competitions  routinely 
use  mnemonic  techniques  (e.g.,  the  method  of  loci  [143])  which  exploit  associative 
memory [78].  For  example,  to  remember  the  word  'apple'  a  competitor  might 
imagine  a  giant  apple  on  the  floor  in  his  bedroom.  The  bedroom  now  provides 
a  context  which  can  later  be  used  as  a  cue  to  help  the  competitor  remember  the 
word apple. We use $\hat{c} \in \mathcal{C}$ to denote the cue, and we use $\hat{a} \in \mathcal{AS}$ to denote the
corresponding association in a cue-association pair $(\hat{c}, \hat{a})$. Physically, $\hat{c}$ (resp. $\hat{a}$)
might encode the excitement levels of the neurons in the user's brain when he
thinks about his bedroom (resp. apples) [106].

We allow the password management scheme to store $m$ sets of public cues
$c_1, \ldots, c_m \subset \mathcal{C}$ in persistent memory to help the user remember each password.
Because  these  cues  are  stored  in  persistent  memory  they  are  always  available  to 
the  adversary  as  well  as  the  user.  Notice  that  a  password  may  be  derived  from 
multiple cue-association pairs. We use $\hat{c} \in \mathcal{C}$ to denote a cue, $c \subset \mathcal{C}$ to denote a
set of cues, and $C = \bigcup_{i=1}^{m} c_i$ to denote the set of all cues; $n = |C|$ denotes the total
number of cue-association pairs that the user has to remember.

2.3.2  Visitation  Schedules  and  Rehearsal  Requirements 

Each cue $\hat{c} \in C$ may have a rehearsal schedule to ensure that the cue-association
pair $(\hat{c}, \hat{a})$ is maintained.

Definition 1. A rehearsal schedule for a cue-association pair $(\hat{c}, \hat{a})$ is a sequence of times
$t_0^{\hat{c}} < t_1^{\hat{c}} < \cdots$. For each $i \geq 0$ we have a rehearsal requirement: the cue-association pair
must be rehearsed at least once during the time window $[t_i^{\hat{c}}, t_{i+1}^{\hat{c}}) = \{x \in \mathbb{R} \mid t_i^{\hat{c}} \leq x < t_{i+1}^{\hat{c}}\}$.

A rehearsal schedule is sufficient if a user can maintain the association $(\hat{c}, \hat{a})$ by
following the rehearsal schedule. We discuss sufficient rehearsal assumptions in
Section 2.4. The length of each interval $[t_i^{\hat{c}}, t_{i+1}^{\hat{c}})$ may depend on the strength of the
mnemonic technique used to memorize and rehearse a cue-association pair $(\hat{c}, \hat{a})$
as well as on $i$, the number of prior rehearsals. For notational convenience, we use
a function $R : \mathcal{C} \times \mathbb{N} \to \mathbb{R}$ to specify the rehearsal requirements (e.g., $R(\hat{c}, j) = t_j^{\hat{c}}$),
and we use $\mathcal{R}$ to denote a set of rehearsal functions.

A visitation schedule for an account $A_i$ is a sequence of real numbers $\tau_0^i < \tau_1^i < \cdots$,
which represent the times when the account $A_i$ is visited by the user. We do not
assume that the exact visitation schedules are known a priori. Instead we model
visitation schedules using a random process with a known parameter $\lambda_i$ based
on $\mathbb{E}[\tau_{j+1}^i - \tau_j^i]$, the average time between consecutive visits to account $A_i$. A
rehearsal requirement $[t_i^{\hat{c}}, t_{i+1}^{\hat{c}})$ can be satisfied naturally if the user visits a site $A_j$
that uses the cue $\hat{c}$ ($\hat{c} \in c_j$) during the given time window. Formally,

Definition 2. We say that a rehearsal requirement $[t_i^{\hat{c}}, t_{i+1}^{\hat{c}})$ is naturally satisfied by a
visitation schedule $\tau_0^i < \tau_1^i < \cdots$ if $\exists j \in [m], k \in \mathbb{N}$ s.t. $\hat{c} \in c_j$ and $\tau_k^j \in [t_i^{\hat{c}}, t_{i+1}^{\hat{c}})$. We use

$$XR_{t,\hat{c}} = \left|\left\{ i \;\middle|\; t_{i+1}^{\hat{c}} < t \,\wedge\, \forall j, k.\ \left(\hat{c} \notin c_j \vee \tau_k^j \notin [t_i^{\hat{c}}, t_{i+1}^{\hat{c}})\right) \right\}\right|$$

to denote the number of rehearsal requirements that are not naturally satisfied by the
visitation schedule during the time interval $[0, t]$.
We use rehearsal requirements and visitation schedules to quantify the usability
of a password management scheme by measuring the total number of extra
rehearsals. If a cue-association pair $(\hat{c}, \hat{a})$ is not rehearsed naturally during the
interval $[t_i^{\hat{c}}, t_{i+1}^{\hat{c}})$ then the user needs to perform an extra rehearsal to maintain the
association. Intuitively, $XR_{t,\hat{c}}$ denotes the total number of extra rehearsals of the
cue-association pair $(\hat{c}, \hat{a})$ during the time interval $[0, t]$. We use $XR_t = \sum_{\hat{c} \in C} XR_{t,\hat{c}}$
to denote the total number of extra rehearsals during the time interval $[0, t]$ to
maintain all of the cue-association pairs.

Usability Goal: Minimize $\mathbb{E}[XR_t]$.
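
The count of extra rehearsals for a single cue can be sketched directly from Definition 2. The function name `extra_rehearsals` and the example times below are hypothetical; the sketch assumes the rehearsal times and the visit times of all accounts using the cue are given as plain lists of numbers (days).

```python
def extra_rehearsals(rehearsal_times, visit_times, t):
    """Count rehearsal windows [t_i, t_{i+1}) with t_{i+1} < t that contain no visit.

    rehearsal_times: increasing times t_0 < t_1 < ... for one cue.
    visit_times: times at which any account using this cue is visited.
    """
    missed = 0
    for start, end in zip(rehearsal_times, rehearsal_times[1:]):
        if end >= t:
            break  # only windows that close before time t are counted
        if not any(start <= v < end for v in visit_times):
            missed += 1
    return missed


if __name__ == "__main__":
    windows = [1, 2, 4, 8, 16]   # windows [1,2), [2,4), [4,8), [8,16)
    visits = [1.5, 5.0, 20.0]    # hypothetical visit times (days)
    print(extra_rehearsals(windows, visits, t=30))  # prints 2
```

Here the windows [2,4) and [8,16) contain no visit, so two extra rehearsals are needed. Summing this count over all cues gives the total number of extra rehearsals.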


2.3.3  Password  Management  Scheme 

A password management scheme includes a generator $\mathcal{G}_m$ and a rehearsal schedule
$R \in \mathcal{R}$. The generator $\mathcal{G}_m(k, b, \vec{\lambda}, R)$ utilizes a user's knowledge $k \in \mathcal{K}$ and random bits
$b \in \{0, 1\}^*$ to generate passwords $p_1, \ldots, p_m$ and public cues $c_1, \ldots, c_m \subset \mathcal{C}$. $\mathcal{G}_m$ may use
the rehearsal schedule $R$ and the visitation schedules $\vec{\lambda} = (\lambda_1, \ldots, \lambda_m)$ of each site
to help minimize $\mathbb{E}[XR_t]$. Because the cues $c_1, \ldots, c_m$ are public they may be stored
in persistent memory along with the code for the generator $\mathcal{G}_m$. In contrast, the
passwords $p_1, \ldots, p_m$ must be memorized and rehearsed by the user (following $R$) so
that the cue-association pairs $(c_i, p_i)$ are maintained in his associative memory.

Definition 3. A password management scheme is a tuple $(\mathcal{G}_m, R)$, where $\mathcal{G}_m$ is a function
$\mathcal{G}_m : \mathcal{K} \times \{0, 1\}^* \times \mathbb{R}^m \times \mathcal{R} \to (\mathcal{P} \times 2^{\mathcal{C}})^m$ and $R \in \mathcal{R}$ is a rehearsal schedule which the
user must follow for each cue.

Our security analysis is not based on the secrecy of $\mathcal{G}_m$, $k$ or the public cues
$C = \bigcup_{i=1}^{m} c_i$. The adversary will be able to find the cues $c_1, \ldots, c_m$ because they
are stored in persistent memory. In fact, we also assume that the adversary has
background knowledge about the user (e.g., he may know $k$), and that the adversary
knows the password management scheme $\mathcal{G}_m$. The only secret is the random
string $b$ used by $\mathcal{G}_m$ to produce $p_1, \ldots, p_m$.

Example Password Management Schemes. Most password suggestions are too
vague (e.g., "pick an obscure phrase that is personally meaningful to you") to satisfy
the precise requirements of a password management scheme; formal security
proofs of protocols involving human interaction can break down when humans
behave in unexpected ways due to vague instructions [123]. We consider the following
formalization of password management schemes: (1) Reuse Weak: the
user selects a random dictionary word $w$ (e.g., from a dictionary of 20,000 words)
and uses $p_i = w$ as the password for every account $A_i$. (2) Reuse Strong: the user
selects four random dictionary words $(w_1, w_2, w_3, w_4)$ and uses $p_i = w_1 w_2 w_3 w_4$ as
the password for every account $A_i$. (3) Lifehacker (e.g., [4]): the user selects three
random words $(w_1, w_2, w_3)$ from the dictionary as a base password $b = w_1 w_2 w_3$. The
user also selects a random derivation rule $d$ to derive a string from each account
name (e.g., use the first three letters of the account name, use the first three vowels
in the account name). The password for account $A_i$ is $p_i = b\,d(A_i)$ where $d(A_i)$
denotes the derived string. (4) Strong Random and Independent: for each account
$A_i$ the user selects four fresh words independently at random from the dictionary
and uses $p_i = w_1^i w_2^i w_3^i w_4^i$. Schemes (1)-(3) are formalizations of popular password
management strategies. We argue that they are popular because they are easy to
use, while the strongly secure scheme Strong Random and Independent is unpopular
because the user must spend a lot of extra time rehearsing his passwords. See
Appendix 7.6 for more discussion of the security and usability of each scheme.
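
The four example schemes can be sketched as small generators. The eight-word `DICTIONARY` is a tiny stand-in for the 20,000-word dictionary mentioned in the text, the function names are hypothetical, and the Lifehacker sketch fixes one example derivation rule (first three letters of the account name) rather than choosing a rule at random.

```python
import secrets

# Tiny stand-in dictionary; the text assumes roughly 20,000 words.
DICTIONARY = ["apple", "bike", "cloud", "delta", "ember", "frost", "grape", "hotel"]


def random_words(k):
    """Draw k words independently and uniformly at random."""
    return [secrets.choice(DICTIONARY) for _ in range(k)]


def reuse_weak(accounts):
    """One random word, reused for every account."""
    w = random_words(1)[0]
    return {a: w for a in accounts}


def reuse_strong(accounts):
    """Four random words, reused for every account."""
    p = "".join(random_words(4))
    return {a: p for a in accounts}


def lifehacker(accounts, derive=lambda name: name[:3]):
    """Three-word base password plus a string derived from the account name."""
    base = "".join(random_words(3))
    return {a: base + derive(a) for a in accounts}


def strong_random_independent(accounts):
    """Four fresh random words for each account."""
    return {a: "".join(random_words(4)) for a in accounts}


if __name__ == "__main__":
    for scheme in (reuse_weak, reuse_strong, lifehacker, strong_random_independent):
        print(scheme.__name__, scheme(["amazon", "ebay", "gmail"]))
```

Only the last generator draws fresh randomness per account, which is why it is the only one whose passwords share no material across accounts and also the one with the largest rehearsal burden.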


2.4  Usability  Model 


People  typically  adopt  their  password  management  scheme  based  on  usability 
considerations  instead  of  security  considerations  [75].  Our  usability  model  can  be 
used  to  explain  why  users  tend  to  adopt  insecure  password  management  schemes 
like  Reuse  Weak,  Lifehacker,  or  Reuse  Strong.  Our  usability  metric  measures  the 
extra  effort  that  a  user  has  to  spend  rehearsing  his  passwords.  Our  measurement 
depends  on  three  important  factors:  rehearsal  requirements  for  each  cue,  visitation 
rates  for  each  site,  and  the  total  number  of  cues  that  the  user  needs  to  maintain. 
Our  main  technical  result  in  this  section  is  Theorem  1  —  a  formula  to  compute 
the  total  number  of  extra  rehearsals  that  a  user  has  to  do  to  maintain  all  of  his 
passwords  for  t  days.  To  evaluate  the  formula  we  need  to  know  the  rehearsal 
requirements for each cue-association pair as well as the visitation frequency $\lambda_i$
for each account $A_i$.
2.4.1 Rehearsal Requirements


If  the  password  management  scheme  does  not  mandate  sufficient  rehearsal  then 
the  user  might  forget  his  passwords.  Few  memory  studies  have  attempted  to  study 
memory  retention  over  long  periods  of  time  so  we  do  not  know  exactly  what  these 
rehearsal  constraints  should  look  like.  While  security  proofs  in  cryptography  are 
based  on  assumptions  from  complexity  theory  (e.g.,  hardness  of  factoring  and  dis¬ 
crete  logarithm),  we  need  to  make  assumptions  about  humans.  For  example,  the 
assumption  behind  CAPTCHAs  is  that  humans  are  able  to  perform  a  simple  task 
like reading garbled text [152]. A rehearsal assumption specifies what types of rehearsal
constraints are sufficient to maintain a memory. We consider two different
assumptions about sufficient rehearsal schedules: Constant Rehearsal Assumption
(CR) and Expanding Rehearsal Assumption (ER). Because some mnemonic
devices are more effective than others (e.g., many people have amazing visual and
spatial memories [145]) our assumptions are parameterized by a constant $\sigma$ which
represents the strength of the mnemonic devices used to memorize and rehearse
a cue-association pair.

Constant Rehearsal Assumption (CR): The rehearsal schedule given by $R(\hat{c}, i) = i\sigma$
is sufficient to maintain the association $(\hat{c}, \hat{a})$.

CR is a pessimistic assumption: it asserts that memories are not permanently
strengthened by rehearsal. The user must continue rehearsing every $\sigma$ days,
even if the user has frequently rehearsed the password in the past.

Expanding Rehearsal Assumption (ER): The rehearsal schedule given by
$R(\hat{c}, i) = 2^{i\sigma}$ is sufficient to maintain the association $(\hat{c}, \hat{a})$.

ER is more optimistic than CR: it asserts that memories are strengthened by
rehearsal so that memories need to be rehearsed less and less frequently as time
passes. If a password has already been rehearsed $i$ times then the user does not
have to rehearse again for $2^{i\sigma}$ days to satisfy the rehearsal requirement $[2^{i\sigma}, 2^{i\sigma+\sigma})$.
ER is consistent with several long term memory experiments [144], [23, Chapter
7], [160]; we refer the interested reader to Appendix 7.6 for more discussion.
We also consider the rehearsal schedule $R(\hat{c}, i) = i^2$ (derived from [18, 151]) in
Appendix 7.6; the usability results are almost identical to those for ER.
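
The two rehearsal assumptions can be compared with a small sketch that reads CR as rehearsal times t_i = i·σ and ER as t_i = 2^(i·σ). The function names `cr_schedule`, `er_schedule` and `windows` are hypothetical helpers introduced only for this illustration.

```python
def cr_schedule(sigma, count):
    """Constant Rehearsal: t_i = i * sigma for i = 0..count-1."""
    return [i * sigma for i in range(count)]


def er_schedule(sigma, count):
    """Expanding Rehearsal: t_i = 2 ** (i * sigma) for i = 0..count-1."""
    return [2 ** (i * sigma) for i in range(count)]


def windows(times):
    """Turn times t_0 < t_1 < ... into rehearsal windows [t_i, t_{i+1})."""
    return list(zip(times, times[1:]))


if __name__ == "__main__":
    print(windows(cr_schedule(1, 5)))  # [(0, 1), (1, 2), (2, 3), (3, 4)]
    print(windows(er_schedule(1, 5)))  # [(1, 2), (2, 4), (4, 8), (8, 16)]
```

With σ = 1, only eight ER windows close within the first year (the last being [128, 256)), whereas CR imposes a new requirement every day.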
Schedule       λ = 1    1/3    1/7    1/31    1/365
Very Active      10      10     10     10       35
Typical           5      10     10     10       40
Occasional        2      10     20     20       23
Infrequent        0       2      5     10       58

Table 2.1: Visitation Schedules: number of accounts visited with frequency λ
(visits/day)

2.4.2 Visitation Schedules

Visitation  schedules  may  vary  greatly  from  person  to  person.  For  example,  a  2006 
survey  about  Facebook  usage  showed  that  47%  of  users  logged  in  daily,  22.4% 
logged  in  about  twice  a  week,  8.6%  logged  in  about  once  a  week,  and  12%  logged 
in about once a month [15]. We use a Poisson arrival process with parameter $\lambda_i$
to model the visitation schedule for site $A_i$. We formally define a Poisson arrival
process in Appendix 7.1 (see Definition 18). One nice property of a Poisson arrival
process with parameter $\lambda$ is that the value $1/\lambda$ represents the average time between
consecutive arrivals (see Fact 4 in Appendix 10.1). We assume that the value of $1/\lambda_i$,
the average inter-visitation time, is known. For example, some websites (e.g.,
gmail) may be visited daily ($\lambda_i = 1/1$ day) while other websites (e.g., IRS) may
only be visited once a year on average (e.g., $\lambda_i = 1/365$ days). The Poisson process
has  been  used  to  model  the  distribution  of  requests  to  a  web  server  [125].  While 
the  Poisson  process  certainly  does  not  perfectly  model  a  user's  visitation  schedule 
(e.g.,  visits  to  the  IRS  websites  may  be  seasonal)  we  believe  that  the  predictions  we 
derive  using  this  model  will  still  be  useful  in  guiding  the  development  of  usable 
password  management  schemes.  While  we  focus  on  the  Poisson  arrival  process, 
our  analysis  could  be  repeated  for  other  random  processes. 
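
A visitation schedule under this model can be simulated by drawing exponentially distributed gaps between visits. The function name `simulate_visits` and the one-year horizon are assumptions made for illustration; `random.expovariate` is the standard-library sampler for the exponential distribution.

```python
import random


def simulate_visits(rate, horizon, rng=None):
    """Sample visit times on [0, horizon) from a Poisson process.

    rate: expected visits per day (lambda); inter-visit gaps are
    exponentially distributed with mean 1 / rate.
    """
    rng = rng or random.Random()
    times, now = [], 0.0
    while True:
        now += rng.expovariate(rate)
        if now >= horizon:
            return times
        times.append(now)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    weekly = simulate_visits(rate=1 / 7, horizon=365, rng=rng)
    print(len(weekly), "visits in one simulated year")
```

Swapping `expovariate` for another gap distribution is one way to repeat the analysis for a different random process, as the text suggests.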

We  consider  four  very  different  types  of  internet  users:  very  active,  typical, 
occasional and infrequent. Each user account $A_i$ may be visited daily (e.g., $\lambda_i = 1$),
every three days ($\lambda_i = 1/3$), every week (e.g., $\lambda_i = 1/7$), monthly ($\lambda_i = 1/31$), or
yearly ($\lambda_i = 1/365$) on average. See Table 2.1 for the full visitation schedules we
define  for  each  type  of  user.  For  example,  our  very  active  user  has  10  accounts  he 
visits  daily  and  35  accounts  he  visits  annually. 
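
The four user types can be written down as data, which makes it easy to check that each profile covers the same number of accounts. The counts are copied from Table 2.1; the names `RATES`, `VISITATION_SCHEDULES`, `total_accounts` and `expected_visits_per_day` are hypothetical.

```python
from fractions import Fraction

# Visit rates (visits per day) for the five columns of Table 2.1.
RATES = [Fraction(1), Fraction(1, 3), Fraction(1, 7), Fraction(1, 31), Fraction(1, 365)]

# Number of accounts visited at each rate, copied from Table 2.1.
VISITATION_SCHEDULES = {
    "Very Active": [10, 10, 10, 10, 35],
    "Typical": [5, 10, 10, 10, 40],
    "Occasional": [2, 10, 20, 20, 23],
    "Infrequent": [0, 2, 5, 10, 58],
}


def total_accounts(user_type):
    """Total number of accounts held by this type of user."""
    return sum(VISITATION_SCHEDULES[user_type])


def expected_visits_per_day(user_type):
    """Expected number of account visits per day, summed over all accounts."""
    counts = VISITATION_SCHEDULES[user_type]
    return float(sum(n * r for n, r in zip(counts, RATES)))


if __name__ == "__main__":
    for name in VISITATION_SCHEDULES:
        print(name, total_accounts(name), round(expected_visits_per_day(name), 2))
```

Every profile has 75 accounts; the profiles differ only in how those accounts are spread across visit rates.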


Extra Rehearsals. Theorem 1 leads us to our key observation: cue-sharing benefits
users both by (1) reducing the number of cue-association pairs that the user
has to memorize and (2) by increasing the rate of natural rehearsals for each
cue-association pair. For example, a very active user with 75 accounts would need to
perform 420 extra rehearsals over the first year to satisfy the rehearsal requirements
given by ER if he adopts Strong Random and Independent, or just 0.023 with
Lifehacker (see Table 2.2). The number of unique cue-association pairs $n$ decreased
by a factor of 75, but the total number of extra rehearsals $\mathbb{E}[XR_{365}]$ decreased by a
factor of $18{,}260.8 \approx 75 \times 243$ due to the increased natural rehearsal rate.

Assumption        CR (σ = 1)           ER (σ = 1)
Schedule/Scheme   B+D      SRI         B+D      SRI
Very Active       ≈ 0      23,396      .023     420
Typical           .014     24,545      .084     456.6
Occasional        .05      24,652      .12      502.7
Infrequent        56.7     26,751      1.2      564

Table 2.2: $\mathbb{E}[XR_{365}]$: Extra Rehearsals over the first year for both rehearsal assumptions.
B+D: Lifehacker
SRI: Strong Random and Independent

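Theorem 1. Let $i_{\hat{c}}^* = \left(\arg\max_x \, t_x^{\hat{c}} < t\right) - 1$; then

$$\mathbb{E}[XR_t] = \sum_{\hat{c} \in C} \sum_{i=0}^{i_{\hat{c}}^*} \exp\left(-\left(\sum_{j : \hat{c} \in c_j} \lambda_j\right)\left(t_{i+1}^{\hat{c}} - t_i^{\hat{c}}\right)\right)$$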

Theorem  1  follows  easily  from  Lemma  1  and  linearity  of  expectations.  Each 
cue-association  pair  (c,  a)  is  rehearsed  naturally  whenever  the  user  visits  any  site 
which  uses  the  public  cue  c.  Lemma  1  makes  use  of  two  key  properties  of  Poisson 
processes:  (1)  The  natural  rehearsal  schedule  for  a  cue  c  is  itself  a  Poisson  process, 
and  (2)  Independent  Rehearsals  -  the  probability  that  a  rehearsal  constraint  is 
satisfied  is  independent  of  previous  rehearsal  constraints. 

Lemma 1. Let $S_c = \{i \mid c \in c_i\}$ and let $\lambda_c = \sum_{i \in S_c} \lambda_i$; then the probability that the cue c is
not naturally rehearsed during time interval [a, b] is $\exp\left(-\lambda_c (b - a)\right)$.
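
The computation behind Theorem 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the doubling deadlines below are hypothetical placeholders (the actual CR and ER schedules are defined elsewhere in the thesis), and the function names are ours. Each rehearsal interval contributes the Lemma 1 probability that the cue is not rehearsed naturally, and linearity of expectation sums these contributions over intervals and cues.

```python
import math


def expected_extra_rehearsals(cue_rate, deadlines):
    """Expected extra rehearsals for one cue.

    cue_rate  -- combined visitation rate of all accounts that use the cue
    deadlines -- increasing rehearsal deadlines t_0 < t_1 < ... up to the horizon
    Each interval is missed with probability exp(-rate * length) (Lemma 1).
    """
    return sum(
        math.exp(-cue_rate * (later - earlier))
        for earlier, later in zip(deadlines, deadlines[1:])
    )


def total_extra_rehearsals(cues_per_account, rates, deadlines):
    """Sum the per-cue expectation over every underlying cue (linearity)."""
    cue_rates = {}
    for cues, rate in zip(cues_per_account, rates):
        for cue in cues:
            cue_rates[cue] = cue_rates.get(cue, 0.0) + rate
    return sum(expected_extra_rehearsals(r, deadlines) for r in cue_rates.values())


# Hypothetical doubling deadlines (days 1, 2, 4, ..., 256), for illustration only.
DEADLINES = [2 ** i for i in range(9)]

# One cue used by a single yearly account, versus the same cue also shared
# with an account that is visited daily.
unshared = total_extra_rehearsals([{"c"}], [1 / 365], DEADLINES)
shared = total_extra_rehearsals([{"c"}, {"c"}], [1 / 365, 1.0], DEADLINES)
```

Under these assumptions `shared` is much smaller than `unshared`, which is the cue-sharing effect discussed above: the daily account raises the natural rehearsal rate of the shared cue.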

2.5 Security Model


In this section we present a game-based security model for a password
management scheme. The game is played between a user ($\mathcal{U}$) and a resource-bounded
adversary ($\mathcal{A}$) whose goal is to guess one of the user's passwords. We demonstrate
how to select the parameters of the game by estimating the adversary's amortized
cost of guessing. Our security definition is in the style of the exact security
definitions of Bellare and Rogaway [26]. Previous security metrics (e.g., min-entropy,
password strength meters) fail to model the full complexity of the password
management problem (see Appendix 7.3 for more discussion). By contrast, we assume
that the adversary knows the user's password management scheme and is able to
see any public cues. Furthermore, we assume that the adversary has background
knowledge (e.g., birth date, hobbies) about the user (formally, the adversary is
given $k \in \mathcal{K}$). Many breaches occur because the user falsely assumes that certain
information is private (e.g., birth date, hobbies, favorite movie) [7, 134].


Adversary Attacks. Before introducing our game-based security model we
consider the attacks that an adversary might mount. We group the adversary attacks
into three categories. Online Attack: the adversary knows the user's ID and
attempts to guess the password. The adversary will get locked out after s incorrect
guesses (strikes). Offline Attack: the adversary learns both the cryptographic
hash of the user's password and the hash function and can try many guesses $q_{\$B}$.
The adversary is only limited by the resources B that he is willing to invest to crack
the user's password. Plaintext Password Leak Attack: the adversary directly learns
the user's password for an account. Once the adversary recovers the password $p_i$,
the account $A_i$ has been compromised. However, a secure password management
scheme should prevent the adversary from compromising more accounts.

We model online and offline attacks using a guess-limited oracle. Let $S \subseteq [m]$
be a set of indices, each representing an account. A guess-limited oracle $O_{S,q}$ is
a blackbox function with the following behavior: (1) after q queries $O_{S,q}$ stops
answering queries; (2) $\forall i \notin S$, $O_{S,q}(i, p) = \bot$; (3) $\forall i \in S$, $O_{S,q}(i, p_i) = 1$; and (4)
$\forall i \in S$, $p \ne p_i$, $O_{S,q}(i, p) = 0$. Intuitively, if the adversary steals the cryptographic
password hashes for accounts $\{A_i \mid i \in S\}$, then he can execute an offline attack
against each of these accounts. We also model an online attack against account $A_i$
with the guess-limited oracle $O_{\{i\},s}$ with $s \ll q$ (e.g., s = 3 models a three-strikes
policy in which a user is locked out after three incorrect guesses).
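
The four oracle rules above can be illustrated with a small Python class. This is a sketch rather than part of the formal model: the class name, the `BOTTOM` sentinel standing in for the symbol ⊥, the use of `None` for "stops answering", and the choice to count queries about accounts outside S against the limit are all our assumptions.

```python
BOTTOM = "bottom"  # stands in for the "undefined" answer on accounts outside S


class GuessLimitedOracle:
    """Answers at most `query_limit` password guesses for accounts in S."""

    def __init__(self, passwords, accounts, query_limit):
        self._passwords = dict(passwords)   # account index -> true password
        self._accounts = set(accounts)      # the index set S
        self._remaining = query_limit       # the guessing limit q (or s)

    def query(self, account, guess):
        if self._remaining <= 0:
            return None                     # rule 1: the oracle stops answering
        self._remaining -= 1
        if account not in self._accounts:
            return BOTTOM                   # rule 2: no information outside S
        # rules 3 and 4: 1 for the correct password, 0 otherwise
        return 1 if self._passwords[account] == guess else 0


# Hypothetical passwords; a limit of 3 models a three-strikes online policy.
passwords = {1: "kissing piranha", 2: "swallowing bike"}
online = GuessLimitedOracle(passwords, accounts={1}, query_limit=3)
answers = [
    online.query(1, guess)
    for guess in ("a", "b", "kissing piranha", "kissing piranha")
]
```

The fourth query returns `None` because the three-query budget is already spent, even though the guess is correct.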

Game-Based Definition of Security. Our cryptographic game proceeds as
follows:

Setup: The user $\mathcal{U}$ starts with knowledge $k \in \mathcal{K}$, visitation schedule $\lambda \in \mathbb{R}^m$, a
random sequence of bits $b \in \{0,1\}^*$ and a rehearsal schedule $R \in \mathcal{R}$. The user
runs $Q_m(k, b, \lambda, R)$ to obtain m passwords $p_1, \dots, p_m$ and public cues $c_1, \dots, c_m \subseteq C$ for
accounts $A_1, \dots, A_m$. The adversary $\mathcal{A}$ is given k, $Q_m$, $\lambda$ and $c_1, \dots, c_m$.

Plaintext Password Leak Attack: $\mathcal{A}$ adaptively selects a set $S \subseteq [m]$ s.t. $|S| \le r$ and
receives $p_i$ for each $i \in S$.

Offline Attack: $\mathcal{A}$ adaptively selects a set $S' \subseteq [m]$ s.t. $|S'| \le h$, and is given blackbox
access to the guess-limited offline oracle $O_{S',q}$.

Online Attack: For each $i \in [m] - S$, the adversary is given blackbox access to the
guess-limited oracle $O_{\{i\},s}$.

Winner: $\mathcal{A}$ wins by outputting $(j, p)$, where $j \in [m] - S$ and $p = p_j$.

We use AdvWins$(k, b, \lambda, Q_m, \mathcal{A})$ to denote the event that the adversary wins.

Definition 4. We say that a password management scheme $Q_m$ is $(q, \delta, m, s, r, h)$-secure
if for every $k \in \mathcal{K}$ and adversary strategy $\mathcal{A}$ we have

$$\Pr\left[\mathrm{AdvWins}(k, b, \lambda, Q_m, \mathcal{A})\right] \le \delta.$$

Discussion: Observe that the adversary cannot win by outputting the password
for an account that he already compromised in a plaintext password leak. For
example, suppose that the adversary is able to obtain the plaintext passwords
for r = 2 accounts of his choosing: $p_i$ and $p_j$. While each of these breaches is
arguably a success for the adversary, the user's password management scheme
cannot be blamed for any of these breaches. However, if the adversary can use
this information to crack any of the user's other passwords then the password
management scheme can be blamed for the additional breaches. For example, if
our adversary is also able to use $p_i$ and $p_j$ to crack the cryptographic password hash
$H(p_t)$ for another account $A_t$ in at most q guesses then the password management
scheme could be blamed for the breach of account $A_t$. Consequently, the adversary
would win our game by outputting $(t, p_t)$. If the password management scheme
is $(q, 10^{-4}, m, s, 2, 1)$-secure then the probability that the adversary could win is at
most $10^{-4}$, so there is a very good chance that the adversary will fail to crack $p_t$.

Economic Upper Bound on q. Our guessing limit q is based on a model of a
resource-constrained adversary who has a budget of \$B to crack one of the user's
passwords. We use the upper bound $q_{\$B} = \$B / C_q$, where $C_q = \$R / f_H$ denotes the
amortized cost per query (e.g., the cost of renting (\$R) an hour of computing time on
Amazon's cloud [1] divided by $f_H$, the number of times the cryptographic hash
function can be evaluated in an hour). We experimentally estimate $f_H$ for SHA1,
MD5 and BCRYPT [122]; more details can be found in Appendix 7.5. Assuming
that the BCRYPT password hash function [122] was used to hash the passwords
we get $q_{\$B} = B \left(5.155 \times 10^4\right)$; we also consider cryptographic hash functions like
SHA1 and MD5 in Appendix 7.5. In our security analysis we focus on the specific
value $q_{\$10^6} = 5.155 \times 10^{10}$, the number of guesses the adversary can try if he
invests \$$10^6$ to crack the user's password.
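
As a worked example of the economic bound, the following Python sketch computes the guessing limit from a budget. Only the BCRYPT figure of $5.155 \times 10^4$ guesses per dollar comes from the text above; the function names and the \$1 per hour rental price used in the usage example are hypothetical.

```python
import math

# From the text: with BCRYPT, each dollar buys about 5.155 x 10^4 guesses.
BCRYPT_GUESSES_PER_DOLLAR = 5.155e4


def guess_budget(budget_dollars, rental_cost_per_hour, hashes_per_hour):
    """Upper bound on guesses: budget divided by the amortized cost per guess."""
    cost_per_guess = rental_cost_per_hour / hashes_per_hour
    return budget_dollars / cost_per_guess


def bcrypt_guess_budget(budget_dollars):
    """Guessing limit for a BCRYPT-hashed password and a given budget."""
    return budget_dollars * BCRYPT_GUESSES_PER_DOLLAR


# A budget of one million dollars gives the value used in the security analysis.
q_million = bcrypt_guess_budget(1e6)

# Same number derived from a hypothetical $1/hour machine doing 51,550 hashes/hour.
q_from_rates = guess_budget(1e6, rental_cost_per_hour=1.0, hashes_per_hour=51_550)
```

Both expressions evaluate to about $5.155 \times 10^{10}$ guesses.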


Sharing and Security. In Section 2.4 we saw that sharing public cues across
accounts improves usability by (1) reducing the number of cue-association pairs
that the user has to memorize and rehearse, and (2) increasing the rate of natural
rehearsals for each cue-association pair. However, conventional security wisdom
says that passwords should be chosen independently. Is it possible to share public
cues and still satisfy the strong notion of security from Definition 4? Theorem 2
demonstrates that public cues can be shared securely provided that the public
cues $\{c_1, \dots, c_m\}$ are a $(n, \ell, \gamma)$-sharing set family. The proof of Theorem 2 can be
found in Appendix 7.1.

Definition 5. We say that a set family $\mathcal{S} = \{S_1, \dots, S_m\}$ is $(n, \ell, \gamma)$-sharing if (1) $\left|\bigcup_{i=1}^{m} S_i\right| = n$,
(2) $|S_i| = \ell$ for each $S_i \in \mathcal{S}$, and (3) $|S_i \cap S_j| \le \gamma$ for each pair $S_i \ne S_j \in \mathcal{S}$.

Theorem 2. Let $\{c_1, \dots, c_m\}$ be a $(n, \ell, \gamma)$-sharing set of m public cues produced by the
password management scheme $Q_m$. If each $a_i \in \mathcal{AS}$ is chosen uniformly at random then
$Q_m$ satisfies $(q, \delta, m, s, r, h)$-security for $\delta \le \frac{q}{|\mathcal{AS}|^{\ell - \gamma r}}$ and any h.

Discussion: To maintain security it is desirable to have $\ell$ large (so that
passwords are strong) and $\gamma$ small (so that passwords remain strong even after an
adversary compromises some of the accounts). To maintain usability it is desirable
to have n small (so that the user doesn't have to memorize many cue-association
pairs). There is a fundamental trade-off between security and usability because it
is difficult to achieve these goals without making n large.

For the special case h = 0 (e.g., the adversary is limited to online attacks) the
security guarantees of Theorem 2 can be further improved to $\delta \le \frac{sm}{|\mathcal{AS}|^{\ell - \gamma r}}$ because the
adversary is actually limited to sm guesses.

[Figure 2.1: PAO Story with Cue. Panel labels: Public Cue, Private.]

[Figure 2.2: Account $A_{19}$ using Shared Cues with the (43, 4, 1)-sharing set family
CRT(90, 9, 10, 11, 13).]


2.6  Our  Construction 

We present Shared Cues, a novel password management scheme which balances
security and usability considerations. The key idea is to strategically share cues to
make sure that each cue is rehearsed frequently while preserving strong security
goals. Our construction may be used in conjunction with powerful cue-based
mnemonic techniques like memory palaces [143] and person-action-object stories
[78] to increase $\sigma$, the association strength constant. We use person-action-object
stories as a concrete example.


Person-Action-Object Stories. A random person-action-object (PAO) story for a
person (e.g., Bill Gates) consists of a random action $a \in \mathcal{ACT}$ (e.g., swallowing)
and a random object $o \in \mathcal{OBJ}$ (e.g., a bike). While PAO stories follow a very simple
syntactic pattern they also tend to be surprising and interesting because the story is
often unexpected (e.g., Bill Clinton kissing a piranha, or Michael Jordan torturing
a lion). There is good evidence that memorable phrases tend to use uncommon
combinations of words in common syntactic patterns [61]. Each cue $c \in C$ includes
a person (e.g., Bill Gates) as well as a picture. To help the user memorize the story
we tell him to imagine the scene taking place inside the picture (see Figure 2.1
for an example). We use Algorithm 2.2 to automatically generate random PAO
stories. The cue c could be selected either with the user's input (e.g., use the name
of a friend and a favorite photograph) or automatically. As long as the cue c is
fixed before the associated action-object story is selected the cue-association pairs
will satisfy the independence condition of Theorem 2.


2.6.1 Constructing $(n, \ell, \gamma)$-sharing set families

We use the Chinese Remainder Theorem to construct nearly optimal $(n, \ell, \gamma)$-sharing
set families. Our application of the Chinese Remainder Theorem is different
from previous applications of the Chinese Remainder Theorem in cryptography
(e.g., faster RSA decryption algorithm [64], secret sharing [21]). The inputs
$n_1, \dots, n_\ell$ to Algorithm 2.1 should be co-prime so that we can invoke the Chinese
Remainder Theorem (see Figure 2.2 for an example of our construction with
$(n_1, n_2, n_3, n_4) = (9, 10, 11, 13)$).


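Algorithm 2.1 CRT$(m, n_1, \dots, n_\ell)$

Input: m, and $n_1, \dots, n_\ell$.
for $i = 1 \to m$ do
    $S_i \leftarrow \emptyset$
    for $j = 1 \to \ell$ do
        $N_j \leftarrow \sum_{k=1}^{j-1} n_k$
        $S_i \leftarrow S_i \cup \{(i \bmod n_j) + N_j\}$
return $\{S_1, \dots, S_m\}$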

Algorithm 2.2 CreatePAOStories

Input: n, random bits b, images $I_1, \dots, I_n$, and names $P_1, \dots, P_n$.
for $i = 1 \to n$ do
    $a_i \xleftarrow{\$} \mathcal{ACT}$, $o_i \xleftarrow{\$} \mathcal{OBJ}$   % Using random bits b
% Split PAO stories to optimize usability
for $i = 1 \to n$ do
    $\hat{c}_i \leftarrow \left((I_i, P_i, \text{'Act'}), (I_{i+1 \bmod n}, P_{i+1 \bmod n}, \text{'Obj'})\right)$
    $\hat{a}_i \leftarrow \left(a_i, o_{i+1 \bmod n}\right)$
return $\{\hat{c}_1, \dots, \hat{c}_n\}$, $\{\hat{a}_1, \dots, \hat{a}_n\}$
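
The story-splitting step of Algorithm 2.2 can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation: it uses 0-based indices, draws randomness from the standard `secrets` module instead of an explicit bit string b, and the names, image file names and word lists are hypothetical.

```python
import secrets


def create_pao_stories(images, names, actions, objects):
    """Return (cues, associations) with each PAO story split across two cues.

    Cue i shows person i labelled 'Act' and person i+1 (cyclically) labelled
    'Obj'; the matching secret is (action of story i, object of story i+1).
    """
    n = len(names)
    chosen_actions = [secrets.choice(actions) for _ in range(n)]
    chosen_objects = [secrets.choice(objects) for _ in range(n)]
    cues, associations = [], []
    for i in range(n):
        nxt = (i + 1) % n
        cues.append(((images[i], names[i], "Act"), (images[nxt], names[nxt], "Obj")))
        associations.append((chosen_actions[i], chosen_objects[nxt]))
    return cues, associations


# Hypothetical inputs for illustration.
NAMES = ["Bill Gates", "Bill Clinton", "Michael Jordan"]
IMAGES = ["beach.jpg", "forest.jpg", "desert.jpg"]
ACTIONS = ["swallowing", "kissing", "torturing", "kicking"]
OBJECTS = ["bike", "piranha", "lion"]

cues, associations = create_pao_stories(IMAGES, NAMES, ACTIONS, OBJECTS)
```

Each of the n cues therefore prompts the user for one action and one object, while rehearsing two adjacent stories.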


Lemma 2 says that Algorithm 2.1 produces a $(n, \ell, \gamma)$-sharing set family of size
m as long as certain technical conditions apply (e.g., Algorithm 2.1 can be run
with any numbers $n_1, \dots, n_\ell$, but Lemma 2 only applies if the numbers are pairwise
co-prime).

Lemma 2. If the numbers $n_1 < n_2 < \dots < n_\ell$ are pairwise co-prime and $m \le \prod_{i=1}^{\gamma+1} n_i$
then Algorithm 2.1 returns a $\left(\sum_{i=1}^{\ell} n_i, \ell, \gamma\right)$-sharing set of public cues.

Proof. Suppose for contradiction that $|S_i \cap S_k| \ge \gamma + 1$ for $i < k \le m$; then by
construction we can find $\gamma + 1$ distinct indices $j_1, \dots, j_{\gamma+1} \in [\ell]$ such that $i \equiv k \bmod n_{j_t}$ for
$1 \le t \le \gamma + 1$. The Chinese Remainder Theorem states that there is a unique
number $x^*$ s.t. (1) $1 \le x^* \le \prod_{t=1}^{\gamma+1} n_{j_t}$, and (2) $x^* \equiv k \bmod n_{j_t}$ for $1 \le t \le \gamma + 1$.
However, we have $i < m \le \prod_{t=1}^{\gamma+1} n_{j_t}$. Hence, $i = x^*$ and by similar reasoning $k = x^*$.
Contradiction! □

Example: Suppose that we select pairwise co-prime numbers $n_1 = 9$, $n_2 = 10$, $n_3 = 11$,
$n_4 = 13$; then CRT$(m, n_1, \dots, n_4)$ generates a (43, 4, 1)-sharing set family of size
$m = n_1 \times n_2 = 90$ (i.e., the public cues for two accounts will overlap in at most one
common cue), and for $m \le n_1 \times n_2 \times n_3 = 990$ we get a (43, 4, 2)-sharing set family.
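
The example can be checked mechanically. The Python sketch below follows our reading of Algorithm 2.1 (account i receives one cue index per modulus, offset so that the index ranges for different moduli do not overlap) and then measures the sharing parameters by brute force. The function names are ours.

```python
from itertools import combinations


def crt_sharing_family(m, moduli):
    """Build S_1..S_m: account i gets (i mod n_j) + N_j for each modulus n_j,
    where N_j is the sum of the moduli that precede n_j."""
    offsets, total = [], 0
    for n_j in moduli:
        offsets.append(total)
        total += n_j
    return [
        {(i % n_j) + offset for n_j, offset in zip(moduli, offsets)}
        for i in range(1, m + 1)
    ]


def sharing_parameters(family):
    """Return (n, set of set sizes, largest pairwise intersection)."""
    n = len(set().union(*family))
    sizes = {len(s) for s in family}
    gamma = max(len(a & b) for a, b in combinations(family, 2))
    return n, sizes, gamma


family_90 = crt_sharing_family(90, [9, 10, 11, 13])
family_990 = crt_sharing_family(990, [9, 10, 11, 13])
cues_for_account_19 = family_90[18]  # accounts are numbered from 1
```

With these inputs the brute-force check reports n = 43 cues, sets of size 4, and a maximum overlap of 1 for m = 90 and 2 for m = 990, matching the example.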

Lemma 2 implies that we can construct a $(n, \ell, \gamma)$-sharing set system of size
$m \ge \Omega\left((n/\ell)^{\gamma+1}\right)$ by selecting each $n_i \approx n/\ell$. Theorem 3 proves that we can't hope
to do much better: any $(n, \ell, \gamma)$-sharing set system has size $m \le O\left((n/\ell)^{\gamma+1}\right)$. We
refer the interested reader to Appendix 7.1 for the proof of Theorem 3 and for
discussion about additional $(n, \ell, \gamma)$-sharing constructions.

Theorem 3. Suppose that $\mathcal{S} = \{S_1, \dots, S_m\}$ is a $(n, \ell, \gamma)$-sharing set family of size m; then

$$m \le \binom{n}{\gamma+1} \Big/ \binom{\ell}{\gamma+1}.$$
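
To get a feel for how close the CRT construction comes to this limit, the following Python sketch evaluates the upper bound, read here as a ratio of binomial coefficients $\binom{n}{\gamma+1} / \binom{\ell}{\gamma+1}$ (our reading of Theorem 3). The function name is ours, and the result is rounded down because m is an integer.

```python
from math import comb


def max_family_size(n, ell, gamma):
    """Largest m allowed by the bound C(n, gamma+1) / C(ell, gamma+1)."""
    return comb(n, gamma + 1) // comb(ell, gamma + 1)


bound_all_subsets = max_family_size(9, 4, 3)   # all 4-subsets of 9 cues
bound_crt_gamma_1 = max_family_size(43, 4, 1)  # CRT example reaches m = 90
bound_crt_gamma_2 = max_family_size(43, 4, 2)  # CRT example reaches m = 990
```

For (9, 4, 3) the bound is exactly 126, the number of 4-element subsets of 9 cues, so taking all subsets is optimal. For (43, 4, 1) the bound is 150 against the 90 sets produced by the CRT example.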

2.6.2  Shared  Cues 

Our password management scheme, Shared Cues, uses a $(n, \ell, \gamma)$-sharing set
family of size m (e.g., a set family generated by Algorithm 2.1) as a hardcoded input
to output the public cues $c_1, \dots, c_m \subseteq C$ and passwords $p_1, \dots, p_m$ for each account.
We use Algorithm 2.2 to generate the underlying cues $\hat{c}_1, \dots, \hat{c}_n \in C$ and their
associated PAO stories. The computer is responsible for storing the public cues
in persistent memory and the user is responsible for memorizing and rehearsing
each cue-association pair $(\hat{c}_i, \hat{a}_i)$.

We use two additional tricks to improve usability: (1) Algorithm 2.2 splits
each PAO story into two parts so that each cue $\hat{c}$ consists of two pictures and two
corresponding people with a label (action/object) for each person (see Figure 2.2).
A user who sees cue $\hat{c}_i$ will be rehearsing both the i'th and the (i+1)'th PAO story,
but will only have to enter one action and one object. (2) To optimize usability
we use GreedyMap (Algorithm 2.4) to produce a permutation $\pi : [m] \to [m]$ over
the public cues; the goal is to minimize the total number of extra rehearsals by
ensuring that each cue is used by a frequently visited account.


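Algorithm 2.3 SharedCues$[S_1, \dots, S_m]$ $Q_m$

Input: $k \in \mathcal{K}$, b, $\lambda_1, \dots, \lambda_m$, Rehearsal Schedule R.
$\{\hat{c}_1, \dots, \hat{c}_n\}, \{\hat{a}_1, \dots, \hat{a}_n\} \leftarrow$ CreatePAOStories$(n, I_1, \dots, I_n, P_1, \dots, P_n)$
for $i = 1 \to m$ do
    $c_i \leftarrow \{\hat{c}_j \mid j \in S_i\}$, and $p_i \leftarrow \{\hat{a}_j \mid j \in S_i\}$.
% Permute cues
$\pi \leftarrow$ GreedyMap$(m, \lambda_1, \dots, \lambda_m, c_1, \dots, c_m, R, \sigma)$
return $\left(p_{\pi(1)}, c_{\pi(1)}\right), \dots, \left(p_{\pi(m)}, c_{\pi(m)}\right)$

User: Rehearses the cue-association pairs $(\hat{c}_i, \hat{a}_i)$ by following the rehearsal
schedule R.

Computer: Stores the public cues $c_1, \dots, c_m$ in persistent memory.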


Once we have constructed our public cues $c_1, \dots, c_m \subseteq C$ we need to create a
mapping $\pi$ between cues and accounts $A_1, \dots, A_m$. Our goal is to minimize the
total number of extra rehearsals that the user has to do to satisfy his rehearsal
requirements. Formally, we define the Min-Rehearsal problem as follows:

Instance: Public cues $c_1, \dots, c_m \subseteq C$, visitation schedule $\lambda_1, \dots, \lambda_m$, a rehearsal
schedule R for the underlying cues $\hat{c} \in C$ and a time frame t.

Output: A bijective mapping $\pi : \{1, \dots, m\} \to \{1, \dots, m\}$ mapping account $A_i$ to
public cue $S_{\pi(i)}$ which minimizes $E[XR_t]$.

Unfortunately, we can show that Min-Rehearsal is NP-Hard to even approximate
within a constant factor. Our reduction from Set Cover can be found in Appendix
7.1 of this paper. Instead GreedyMap uses a greedy heuristic to generate a
permutation $\pi$.

Theorem  4.  It  is  NP-Hard  to  approximate  Min-Rehearsal  within  a  constant  factor. 


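Algorithm 2.4 GreedyMap

Input: m, $\lambda_1, \dots, \lambda_m$, $c_1, \dots, c_m$, Rehearsal Schedule R (e.g., CR or ER with
parameter $\sigma$).
Relabel: Sort $\lambda$'s s.t. $\lambda_i \ge \lambda_{i+1}$ for all $i \le m - 1$.
Initialize: $\pi_0(j) \leftarrow \bot$ for $j \le m$, UsedCues $\leftarrow \emptyset$.
% $\pi_i$ denotes a partial mapping $[i] \to [m]$; for $j > i$, the mapping is
% undefined (e.g., $\pi_i(j) = \bot$). Let $S_k = \{\hat{c} \mid \hat{c} \in c_k\}$.
for $i = 1 \to m$ do
    for all $j \in [m] -$ UsedCues do
        % $\Delta_j$: expected reduction in total extra rehearsals if we set $\pi_i(i) = j$
        $\Delta_j \leftarrow \sum_{\hat{c} \in S_j} \left( E\left[XR_{t,\hat{c}} \mid \lambda_{\hat{c}} = \sum_{k < i : \hat{c} \in S_{\pi_{i-1}(k)}} \lambda_k\right] - E\left[XR_{t,\hat{c}} \mid \lambda_{\hat{c}} = \lambda_i + \sum_{k < i : \hat{c} \in S_{\pi_{i-1}(k)}} \lambda_k\right] \right)$
    $\pi_i(i) \leftarrow \arg\max_j \Delta_j$, UsedCues $\leftarrow$ UsedCues $\cup \{\pi_i(i)\}$
return $\pi_m$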
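
The greedy heuristic can be sketched in Python as follows. This is a simplified illustration and not the authors' code: it assumes a single list of rehearsal deadlines shared by all cues, uses the per-interval miss probability from Lemma 1 to score each candidate, and breaks ties by the lowest cue index. All names are ours.

```python
import math


def expected_extra(rate, deadlines):
    """Expected extra rehearsals for one cue with the given natural rate."""
    return sum(math.exp(-rate * (b - a)) for a, b in zip(deadlines, deadlines[1:]))


def greedy_map(rates, cue_sets, deadlines):
    """Assign one public cue to each account, most-visited account first.

    rates     -- visitation rate of each account
    cue_sets  -- cue_sets[j] is the set of underlying cues in public cue j
    deadlines -- rehearsal deadlines, assumed identical for every cue
    Returns a dict mapping account index -> assigned public cue index.
    """
    order = sorted(range(len(rates)), key=lambda i: -rates[i])
    cue_rate = {}                       # accumulated rate per underlying cue
    unused = set(range(len(cue_sets)))
    mapping = {}
    for account in order:
        def reduction(j):
            # Drop in expected extra rehearsals if this account takes cue j.
            return sum(
                expected_extra(cue_rate.get(c, 0.0), deadlines)
                - expected_extra(cue_rate.get(c, 0.0) + rates[account], deadlines)
                for c in cue_sets[j]
            )
        best = max(sorted(unused), key=reduction)
        mapping[account] = best
        unused.remove(best)
        for c in cue_sets[best]:
            cue_rate[c] = cue_rate.get(c, 0.0) + rates[account]
    return mapping


# Hypothetical instance: three accounts, three public cues over cues {0, 1, 2}.
mapping = greedy_map(
    rates=[1.0, 1 / 365, 1 / 7],
    cue_sets=[{0, 1}, {1, 2}, {0, 2}],
    deadlines=[1, 2, 4, 8],
)
```

Because Min-Rehearsal is hard to approximate, this procedure offers no optimality guarantee; it simply serves the most frequently visited accounts first.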


2.6.3  Usability  and  Security  Analysis 

We consider three instantiations of Shared Cues: SC-0, SC-1 and SC-2. SC-0
uses a (9, 4, 3)-sharing family of public cues of size m = 126, constructed by
taking all $\binom{9}{4} = 126$ subsets of size 4. SC-1 uses a (43, 4, 1)-sharing family of
public cues of size m = 90, constructed using Algorithm 2.1 with m = 90 and
$(n_1, n_2, n_3, n_4) = (9, 10, 11, 13)$. SC-2 uses a (60, 5, 1)-sharing family of public cues of
size m = 90, constructed using Algorithm 2.1 with m = 90 and
$(n_1, n_2, n_3, n_4, n_5) = (9, 10, 11, 13, 17)$.

Assumption          CR ($\sigma = 1$)                        ER ($\sigma = 1$)
Schedule/Scheme     SC-0       SC-1       SC-2       SC-0       SC-1       SC-2
Very Active         ≈ 0        1,309      2,436      ≈ 0        3.93       7.54
Typical             ≈ 0.42     3,225      5,491      ≈ 0        10.89      19.89
Occasional          ≈ 1.28     9,488      6,734      ≈ 0        22.07      34.23
Infrequent          ≈ 723      13,214     18,764     ≈ 2.44     119.77     173.92

Table 2.3: $E[XR_{365}]$: Extra Rehearsals over the first year for SC-0, SC-1 and SC-2.


Offline Attack?                h = 0                                                                  h > 0
$(n, \ell, \gamma)$-sharing    r = 0                    r = 1                    r = 2                   r = 0                     r = 1                    r = 2
(n, 4, 3) (e.g., SC-0)         $2 \times 10^{-15}$      0.011                    1                       $3.5 \times 10^{-7}$      1                        1
(n, 4, 1) (e.g., SC-1)         $2 \times 10^{-15}$      $4 \times 10^{-11}$      $8 \times 10^{-7}$      $3.5 \times 10^{-7}$      0.007                    1
(n, 5, 1) (e.g., SC-2)         $1 \times 10^{-19}$      $2 \times 10^{-15}$      $4 \times 10^{-11}$     $1.8 \times 10^{-11}$     $3.5 \times 10^{-7}$     0.007

Table 2.4: Shared Cues $(q_{\$10^6}, \delta, m, s, r, h)$-Security: $\delta$ vs h and r using a $(n, \ell, \gamma)$-sharing
family of m public cues.


Our usability results can be found in Table 2.3 and our security results can be
found in Table 2.4. We present our usability results for the very active, typical,
occasional and infrequent internet users (see Table 2.1 for the visitation schedules)
under both sufficient rehearsal assumptions CR and ER. Table 2.3 shows the values
of $E[XR_{365}]$, computed using the formula from Theorem 1, for SC-0, SC-1 and
SC-2. We used association strength parameter $\sigma = 1$ to evaluate each password
management scheme, though we expect that $\sigma$ will be higher for schemes like
Shared Cues that use strong mnemonic techniques.$^4$

Our security guarantees for SC-0, SC-1 and SC-2 are illustrated in Table 2.4.
The values were computed using Theorem 2. We assume that $|\mathcal{AS}| = 140^2$ where
$\mathcal{AS} = \mathcal{ACT} \times \mathcal{OBJ}$ (e.g., there are 140 distinct actions and objects), and that the
adversary is willing to spend at most \$$10^6$ on cracking the user's passwords (e.g.,
$q = q_{\$10^6} = 5.155 \times 10^{10}$). The values of $\delta$ in the h = 0 columns were computed
assuming that $m \le 100$.
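
Most entries of Table 2.4 can be reproduced with a short Python sketch. It rests on our reading of the Theorem 2 bounds, namely $\delta \le q / |\mathcal{AS}|^{\ell - \gamma r}$ when offline attacks are allowed and $\delta \le sm / |\mathcal{AS}|^{\ell - \gamma r}$ when h = 0, capped at 1. The values s = 3 and m = 100 are our assumptions for the online case. Under these assumptions the (n, 4, 3), r = 1, h = 0 entry comes out near 0.015 rather than the tabulated 0.011, so that entry presumably used slightly different parameters.

```python
Q_MILLION = 5.155e10        # guesses affordable with a budget of $10^6 (BCRYPT)
ASSOCIATION_SPACE = 140 ** 2  # 140 actions x 140 objects


def delta_offline(q, space, ell, gamma, r):
    """Bound on the adversary's success probability with offline guessing."""
    return min(1.0, q / space ** (ell - gamma * r))


def delta_online(s, m, space, ell, gamma, r):
    """Bound when the adversary only gets s online guesses per account."""
    return min(1.0, s * m / space ** (ell - gamma * r))


# SC-1 is (n, 4, 1)-sharing; SC-2 is (n, 5, 1)-sharing; SC-0 is (n, 4, 3)-sharing.
sc1_offline_r0 = delta_offline(Q_MILLION, ASSOCIATION_SPACE, 4, 1, 0)
sc1_offline_r1 = delta_offline(Q_MILLION, ASSOCIATION_SPACE, 4, 1, 1)
sc2_offline_r0 = delta_offline(Q_MILLION, ASSOCIATION_SPACE, 5, 1, 0)
sc0_offline_r1 = delta_offline(Q_MILLION, ASSOCIATION_SPACE, 4, 3, 1)
sc1_online_r0 = delta_online(3, 100, ASSOCIATION_SPACE, 4, 1, 0)
```

The results are approximately 3.5e-7, 0.007, 1.8e-11, 1 and 2e-15 respectively, in line with the corresponding table entries.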

Discussion: Comparing Tables 2.3 and 2.2 we see that Lifehacker is the most
usable password management scheme, but SC-0 compares very favorably! Unlike
Lifehacker, SC-0 provides provable security guarantees after the adversary phishes
one account, though the guarantees break down if the adversary can also execute
an offline attack. While SC-1 and SC-2 are not as secure as Strong Random
and Independent (the security guarantees from Strong Random and Independent do
not break down even if the adversary can recover many of the user's plaintext
passwords), SC-1 and SC-2 are far more usable than Strong Random and
Independent. Furthermore, SC-1 and SC-2 do provide very strong security guarantees
(e.g., SC-2 passwords remain secure against offline attacks even after an adversary
obtains two plaintext passwords for accounts of his choosing). For the very active,
typical and occasional user the number of extra rehearsals required by SC-1 and
SC-2 is quite reasonable (e.g., the typical user would need to perform fewer than
one extra rehearsal per month). The usability benefits of SC-1 and SC-2 are less
pronounced for the infrequent user, though the advantage over Strong Random
and Independent is still significant.

$^4$ We explore the effect of $\sigma$ on $E[XR_{t,c}]$ in Appendix 7.6.


2.7  Discussion  and  Future  Work 

We  conclude  by  discussing  future  directions  of  research. 


Sufficient  Rehearsal  Assumptions.  While  there  is  strong  empirical  evidence  for 
the  Expanding  Rehearsal  assumption  in  the  memory  literature  (e.g.,  [160]),  the 
parameters  we  use  are  drawn  from  prior  studies  in  other  domains.  In  Chapter 
4  we  present  preliminary  results  from  user  studies  we  are  conducting  to  test  the 
Expanding  Rehearsal  assumption  in  the  password  context,  and  obtain  parameter 
estimates  specific  to  the  password  setting. 


Expanding  Security  over  Time.  Most  extra  rehearsals  occur  soon  after  the 
user  memorizes  a  cue-association  pair  —  when  the  rehearsal  intervals  are  still 
small.  Is  it  possible  to  start  with  a  password  management  scheme  with  weaker 
security  guarantees  (e.g.,  SC-0),  and  increase  security  over  time  by  having  the  user 
memorize  additional  cue-association  pairs  as  time  passes? 


Secure Password Recovery Mechanism. Recently, we proposed a password
recovery mechanism which would allow users who forget a few of their stories to
recover them provided that they can still remember a couple of their other stories
[164]. For example, suppose that we have the hash of the user's first six PAO
stories $H(a_1 o_1 \dots a_6 o_6)$. The entropy of the string $a_1 o_1 \dots a_6 o_6$ is high enough that
no adversary who manages to obtain this hash will be able to crack the password.
However, if the user can remember five of these stories then it is trivial for our
password recovery mechanism to find the sixth story by brute force search. By
storing enough cryptographic hashes we can ensure that our user can recover any
story that he forgets provided that he can remember any five of his other stories.
These hashes could be deleted after each of the stories is firmly entrenched
in the user's memory (e.g., after the user has rehearsed each of his stories many
times).
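
The recovery idea can be illustrated with a short Python sketch. It is not the mechanism from [164]: SHA-256, the string encoding of the stories, and the tiny word lists are all assumptions made for illustration (a real deployment would use the full action and object sets and most likely a deliberately slow password hash).

```python
import hashlib
from itertools import product


def story_hash(stories):
    """Hash an ordered list of (action, object) stories (SHA-256 assumed)."""
    encoded = "|".join(f"{action},{obj}" for action, obj in stories).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def recover_story(known, missing_index, target_hash, actions, objects):
    """Brute-force the single forgotten story.

    known         -- the remembered stories, in their original order
    missing_index -- position of the forgotten story in the full list
    """
    for candidate in product(actions, objects):
        stories = known[:missing_index] + [candidate] + known[missing_index:]
        if story_hash(stories) == target_hash:
            return candidate
    return None


# Hypothetical word lists and stories.
ACTIONS = ["swallowing", "kissing", "torturing", "kicking"]
OBJECTS = ["bike", "piranha", "lion", "saw"]
stories = [
    ("kissing", "piranha"), ("swallowing", "bike"), ("torturing", "lion"),
    ("kicking", "saw"), ("kissing", "bike"), ("swallowing", "lion"),
]
stored_hash = story_hash(stories)

# The user forgets the fourth story but remembers the other five.
remembered = stories[:3] + stories[4:]
recovered = recover_story(remembered, 3, stored_hash, ACTIONS, OBJECTS)
```

With 140 actions and 140 objects the search covers at most 19,600 candidates, which is why recovering one story is trivial, whereas an adversary without any of the stories faces the full six-story space.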


Login Time. One potential usability drawback of Shared Cues is that it might
take the user a few seconds to type in each password because the passwords are
long (e.g., one password consists of four actions and four objects). One way to save time
during  authentication  would  be  to  instruct  the  user  to  form  his  password  from 
the  first  three  characters  in  each  action  and  each  object  instead  of  typing  in  each 
word  completely  (most  actions  and  objects  in  our  set  could  be  uniquely  identified 
from  the  first  two  or  three  characters  in  the  word).  Alternatively,  we  could  use 
auto-completion  to  help  the  user  type  in  his  passwords  faster.  Because  we  assume 
that  the  adversary  already  knows  the  set  of  actions  and  objects  that  we  are  using 
we  would  not  reduce  the  adversary's  search  space  by  using  only  the  first  three 
characters  of  each  word  or  by  using  auto-completion. 


Human Computable Passwords. Shared Cues only relies on the human capacity
to memorize and retrieve information, and is secure against at most $r = \ell/\gamma$
plaintext password leak attacks. Could we improve security (or usability) by having
the user perform simple computations to recover his passwords? In Chapter 3 we
present a candidate Human Computable Password scheme and provide strong
evidence that this scheme will remain secure even after many (e.g., 50-100) plaintext
password breaches.

Chapter 3

Human  Computable  Passwords 


3.1  Introduction 

Secure cryptographic protocols to authenticate humans typically assume that the
human will receive assistance from trusted hardware or software. One interesting
challenge for the cryptography community is to build authentication protocols
that are so simple that a human can execute them without relying on assistance
from a trusted computer. In this chapter we propose several candidate human
authentication protocols in a setting in which the user can only receive assistance
from a semi-trusted computer: a computer that can be trusted to store information
and perform computations correctly, but cannot be trusted to ensure privacy.
In our schemes, a semi-trusted computer is used to store and display public
challenges $C_i \in [n]^k$. The user memorizes a random secret mapping $\sigma : [n] \to \mathbb{Z}_d$ and
authenticates by computing responses $f(\sigma(C_i))$ to a sequence of public challenges,
where $f : \mathbb{Z}_d^k \to \mathbb{Z}_d$ is a function that is easy for the human to evaluate. We prove
that any statistical adversary needs to sample $m = \tilde{\Omega}\left(n^{s(f)}\right)$ challenge-response
pairs to recover $\sigma$, for a security parameter $s(f)$ that depends on two key
properties of f.$^1$ Our lower bound generalizes recent results of Feldman et al. [73],
who proved analogous results for the special case d = 2. To obtain our results we
apply the general hypercontractivity theorem [116] to lower bound the statistical
dimension of the distribution over challenge-response pairs induced by f and $\sigma$.
Our statistical dimension lower bounds apply to arbitrary functions $f : \mathbb{Z}_d^k \to \mathbb{Z}_d$,
not just functions that are easy for a human to evaluate, and may be of
independent interest. For our particular schemes, we show that forging passwords
is equivalent to recovering the secret mapping. We also show that $s(f_1) = 1.5$ for
our first scheme and that $s(f_2) = 2$ in our second scheme. Thus, our human
computable password schemes can maintain strong security guarantees even after an
adversary has observed the user login to many different accounts (e.g., 100). We
also issue a public challenge to the cryptography community to crack passwords
that were generated using our human computable password schemes.

$^1$ Our guarantees are not information theoretic. Indeed, a computationally unbounded
adversary would need to see at most $O(n)$ challenge-response pairs to break any such human computable
password scheme. Our lower bounds provide strong evidence that any polynomial time adversary
will need at least $m = \tilde{\Omega}\left(n^{s(f)}\right)$ challenge-response pairs, even if $s(f) > 1$.

In  Chapter  2  we  initiated  the  rigorous  study  of  usable  and  secure  password 
management  schemes  —  systematic  strategies  to  help  users  create  and  remember 
multiple  passwords.  Shared  Cues,  the  proposed  password  management  scheme 
from  Chapter  2,  balances  security  and  usability  considerations.  However,  Shared 
Cues  only  maintains  security  for  a  small  (constant)  number  of  plaintext  password 
breaches  (e.g.,  1  to  4).  An  adversary  who  has  seen  several  of  the  user's  passwords 
might  be  able  to  break  the  user's  passwords  at  other  accounts.  This  raises  an 
important  question:  Is  it  possible  to  design  a  human  authentication  protocol  that 
allows  a  user  to  authenticate  to  multiple  untrusted  parties  and  will  remain  secure 
even  after  many  breaches  (e.g.,  50  to  100)? 

In this chapter the goal is to develop a secure human computable password
management scheme in which security guarantees are maintained after many
breaches. In a human computable password management scheme the user
reconstructs each of his passwords by computing the response to a public challenge.
The computation may only involve a few very simple operations (e.g., addition
modulo 10) over secret values (digits) that the user has memorized. More
specifically, in our candidate human computable password schemes the user learns to
compute a simple function $f : \mathbb{Z}_d^k \to \mathbb{Z}_d$ (in our candidate schemes we adopt the
base d = 10 that is natural for most humans), and memorizes a secret mapping
$\sigma : [n] \to \mathbb{Z}_d$. The user authenticates by responding to a sequence of single digit
challenges; a challenge-response pair $(C, f(\sigma(C)))$ is a challenge $C \in X_k \subseteq [n]^k$
and the corresponding response.

Our  first  candidate  human  computable  password  scheme  uses  the  function 

$$f_1(x_0, x_1, \ldots, x_{13}) = x_{13} + x_{12} + x_{(x_{10} + x_{11} \bmod 10)} \bmod 10 .$$

To evaluate this function a human would only need to perform three addition
operations modulo 10. While this function is quite simple, we show that the attacker
would need to see $\Omega(n^{1.5})$ challenge-response pairs before he can forge the user's
passwords (accurately predict the responses to randomly selected challenges). In
particular, if we ask the user to memorize a secret mapping of length $n = 100$
and the password for each account is ten digits, then the adversary would need to
breach about one hundred of the user's accounts before he could obtain enough
challenge-response pairs ($100^{1.5} = 1000 = 10 \cdot 100$) to forge the user's passwords.
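
As a minimal sketch of how a single-digit response is computed with the first candidate function, the following Python snippet evaluates $f_1$ on the secret digits selected by a challenge. The helper names (`f1`, `respond`), the 0-based indexing of the mapping, and the random test data are illustrative assumptions, not part of the scheme's specification.

```python
import random

def f1(x):
    """Evaluate f_1 on 14 digits x[0..13], each in 0..9."""
    assert len(x) == 14 and all(0 <= v <= 9 for v in x)
    j = (x[10] + x[11]) % 10            # first addition: selects one of x[0..9]
    return (x[13] + x[12] + x[j]) % 10  # two more additions modulo 10

def respond(sigma, challenge):
    """Response f_1(sigma(C)) to a single-digit challenge C of 14 distinct indices."""
    return f1(tuple(sigma[i] for i in challenge))

# Hypothetical data: a random secret mapping of length n = 100 (0-based indices).
rng = random.Random(0)
n = 100
sigma = [rng.randrange(10) for _ in range(n)]
challenge = rng.sample(range(n), 14)
digit = respond(sigma, challenge)

# Back-of-the-envelope count from the text: n^1.5 pairs versus 10 digits per account.
pairs_needed = round(n ** 1.5)
accounts_needed = pairs_needed // 10
```

With $n = 100$ the last two lines reproduce the arithmetic in the paragraph above: roughly 1000 challenge-response pairs, which corresponds to about 100 breached accounts of ten digits each.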

As  in  Chapter  2  we  consider  a  setting  where  a  user  has  two  types  of  memory: 
persistent  memory  (e.g.,  a  sticky  note  or  a  text  file  on  his  computer)  and  associative 
memory  (e.g.,  his  own  human  memory).  We  assume  that  persistent  memory  is 
reliable  and  convenient  but  not  private  (i.e.,  accessible  to  an  adversary).  In  contrast, 
a  user's  associative  memory  is  private  but  lossy — if  the  user  does  not  rehearse  a 
memory it may be forgotten. Thus, the user can store a password challenge $C \in X_k$
in persistent memory, but the mapping $\sigma$ must be stored in associative memory
(e.g.,  memorized  and  rehearsed).  We  allow  the  user  to  receive  assistance  from 
a  semi-trusted  computer.  A  semi-trusted  computer  will  perform  computations 
accurately  (e.g.,  it  can  be  trusted  to  show  the  user  the  correct  challenge),  but  it  will 
not  ensure  privacy  of  its  inputs  or  outputs.  This  means  that  a  human  computable 
password management scheme should be based on a function $f$ that the user can
compute entirely in his head.


Contributions. We develop a general framework for analyzing the security of a
human computable password management scheme and we propose two candidate
human computable password management schemes. We give evidence that
our schemes remain secure until the adversary has seen at least $\Omega(n^{s(f)})$
challenge-response pairs $(C, f(\sigma(C)))$. Here, $s(f) = \min\{r(f)/2, g(f) + 1\}$ is a composite
security parameter which captures $g(f)$ (how many inputs to $f$ need to be fixed to make
$f$ linear?) and $r(f)$ (what is the largest value of $r$ such that the distribution over
challenge-response pairs is $(r-1)$-wise independent?). We show that $s(f) = 1.5$
for our first scheme and $s(f) = 2$ for our second scheme. In particular, we prove that
any statistical adversary needs to see at least $\Omega(n^{r(f)/2})$ challenge-response pairs
$(C, f(\sigma(C)))$ before he can even approximately recover the secret mapping $\sigma$. Our
lower bound is based on the statistical dimension of the distribution over
challenge-response pairs induced by $f$ and $\sigma$. We stress that our analysis of the statistical
dimension applies to arbitrary functions $f : \mathbb{Z}_d^k \to \mathbb{Z}_d$, not just functions that are
easy for humans to compute. Our analysis of the statistical dimension generalizes
recent results of Feldman et al. [73], which only applied to binary predicates
(i.e., $d = 2$), and may be of independent interest. Because our function $f$ is not a
binary predicate we cannot use the Walsh basis functions to express the Fourier
decomposition of $f$ and analyze the statistical dimension of our distribution over
challenge-response pairs as Feldman et al. [73] do. Instead, we use a generalized
set of Fourier basis functions to take the Fourier basis decomposition of $f$,
and we apply the general hypercontractivity theorem [116] to obtain our bounds
on the statistical dimension. Furthermore, we show that forging passwords and
approximately recovering the secret mapping are equivalent for any 'reasonable'
candidate human computable password scheme. This means that any adversary
who can predict the response $f(\sigma(C))$ to a random challenge $C$ with better accuracy
than random guessing can be used as a black box to approximately recover the
secret mapping. These results imply that any statistical adversary needs to see
at least $\Omega(n^{r(f)/2})$ challenge-response pairs before he can accurately forge
passwords. This is significant because almost all known algorithmic techniques have
statistical analogues. In particular, techniques like Expectation Maximization [62],
local search, MCMC optimization [83], first and second order methods for convex
optimization, PCA, ICA and k-means can be modeled as statistical algorithms; see
[37] and [55] for proofs. While Gaussian Elimination is a notable exception, our
composite security parameter accounts for attacks based on Gaussian Elimination:
we show that an adversary needs to see $m = \Omega(n^{1+g(f)})$ challenge-response pairs
to recover $\sigma$ using Gaussian Elimination. To analyze the usability of our candidate
human computable password schemes we use the usability model from Chapter 2
to quantify the effort that a user must expend to memorize and rehearse the secret
mapping $\sigma$, and we use step counting to estimate the effort that a user must
expend to compute each password. We also propose a mnemonic tool to help users
memorize their secret mapping $\sigma$. Finally, we constructed public challenges for
cryptographers to break our human computable password management schemes
under various parameters (e.g., $n = 100$, $m = 1000$).


Organization. The rest of the chapter is organized as follows: We first explore
related work in Section 3.2. We then introduce preliminary notation and definitions
in Section 3.3. We present our main technical results in Section 3.4 including an
overview of our lower bound for statistical adversaries. We use these results to
provide general security bounds for a human computable password scheme in
Section 3.5. We introduce our candidate human computable password schemes
in Section 3.6 and analyze the security and usability of these schemes. We
conclude in Section 3.7 by presenting our human computable password challenge and
discussing how a human computable password scheme could be used to defend
against an adversary who can always observe the user when he logs into any of his
accounts (e.g., every time the user computes the response $f(\sigma(C))$ to a single-digit
challenge the adversary observes the pair $(C, f(\sigma(C)))$).


3.2  Related  Work 

The  literature  on  passwords  has  grown  rapidly  over  the  past  decade.  One  line  of 
prior  work  has  focused  on  the  effects  of  password  composition  rules  (e.g.,  requiring 
a  password  to  contain  capital  letters  and  numbers)  on  individual  passwords  [34, 
101].  Another  line  of  prior  work  has  focused  on  empirical  studies  of  user  behavior 
in password management [39, 52, 75, 102] (e.g., How many different passwords
do  people  have?  How  often  do  users  reuse  the  same  password?  How  strong  are 
the  passwords  that  people  pick?).  These  studies  consistently  paint  a  grim  picture. 
Many  of  the  passwords  that  users  select  have  low  entropy  and  users  frequently 
reuse  their  passwords.  Shay  et  al.  [140]  empirically  studied  the  usability  of  system 
assigned  passwords  and  found  that  users  often  had  difficulty  remembering  system 
assigned  passwords.  Some  researchers  have  considered  replacing  text  passwords 
with  graphical  passwords  [27,  48]  driven  by  evidence  that  humans  have  a  large 
capacity  for  visual  memories  [145]  and  that  cued-recall  is  easier  than  pure  recall 
[23].  Fundamentally,  both  graphical  passwords  and  text  passwords  rely  solely 
on  the  user's  ability  to  remember  something  (e.g.,  a  string,  a  face  or  a  location 
on  a  picture).  Many  security  metrics  have  been  proposed  to  analyze  the  security 
of  a  dataset  of  passwords  or  to  estimate  the  security  of  an  individual  password 
[39, 45, 107, 121].  While  these  metrics  can  provide  useful  feedback  about  individual 
passwords  (e.g.,  they  rule  out  some  insecure  passwords)  they  do  not  deal  with  the 
complexities  of  securing  multiple  accounts  against  an  adversary  (e.g.,  they  don't 
consider  correlations  between  a  user's  passwords). 

In Chapter 2 we considered the problem of developing usable and secure
password management schemes: strategies for creating and remembering multiple
passwords. We use the same usability model in this chapter to quantify the effort
that a user will need to expend to remember his secret mapping in our human
computable password schemes. We emphasize two key differences between the
work in the previous chapter and the work in this chapter. First, Shared
Cues, the password management scheme from Chapter 2, only maintains security
for a small (constant) number of plaintext password breaches, while our goal in
this chapter is to design protocols that maintain security guarantees even after
many password breaches. There are scenarios in which it may not be reasonable
to assume that the adversary can only compromise a small number of the user's
passwords (e.g., if the user's computer is infected with malware for a few days).
Second, the Shared Cues scheme only requires users to remember several
cue-association pairs to reconstruct their passwords, while the password management
schemes we consider in this chapter require users to perform a few additional
computations in their head to reconstruct their passwords.

Hopper and Blum [91] designed a Human Identification Protocol based on
the noisy parity problem, a learning problem that is believed to be hard. Juels
and Weis [93] modified the protocol of Hopper and Blum to design HB+, a
lightweight authentication protocol for pervasive devices like smartcards.
Subsequent work has explored the security of the HB+ protocol under various threat
models (e.g., man-in-the-middle attacks [47, 84], concurrent composition [95]). We
emphasize a few fundamental differences between our work and the work of
Hopper and Blum. First, they focus on the authentication setting where a human
authenticates to a single trusted party with a shared secret. By contrast, we focus
on the setting where a human user wishes to authenticate to multiple (possibly
untrusted) parties without sharing his secret (e.g., by only sharing the cryptographic
hashes of each password he computes). Second, computations in their protocol
are randomized (e.g., the human occasionally flips his answer), while the
computations in our protocol are deterministic. This is significant because humans
are not good at consciously generating random numbers [74, 111, 154] (e.g., noisy
parity could be easy to learn when humans are providing the source of noise). It also
means that their protocol would need to be modified in our setting so that the
untrusted third party could validate a noisy response using only a cryptographic
hash of the answer; invoking error correcting codes would increase the number
of rounds needed to provide an acceptable level of security. Finally, we focus on
computations of very simple functions over a constant number of variables so that
a human can compute the response to each challenge quickly.

Naor and Pinkas [112] proposed using visual cryptography [113] to address a
related  problem:  how  can  a  human  verify  that  a  message  he  received  from  a  trusted 
server  has  not  been  tampered  with  by  an  adversary?  Their  protocol  requires  the 
human  to  carry  a  visual  transparency  (a  shared  secret  between  the  human  and  the 
trusted  server  in  the  visual  cryptography  scheme),  which  he  will  use  to  verify  that 
messages  from  the  trusted  server  have  not  been  altered. 

A related goal in cryptography, constructing pseudorandom generators in
$NC^0$, was proposed by Goldreich [86] and by Cryan and Miltersen [58]. In
Goldreich's construction we fix $C_1, \ldots, C_m \in [n]^k$ once and for all, and a
binary predicate $P : \{0,1\}^k \to \{0,1\}$. The pseudorandom generator is a function
$G : \{0,1\}^n \to \{0,1\}^m$, whose $i$-th bit $G(x)[i]$ is given by $P$ applied to the bits of
$x$ specified by $C_i$. O'Donnell and Witmer gave evidence that the "Tri-Sum-And"
predicate ($\mathrm{TSA}(x_1, \ldots, x_5) = x_1 + x_2 + x_3 + x_4 x_5 \bmod 2$) provides near-optimal
stretch. In particular, they showed that for $m = n^{1.5-\epsilon}$ Goldreich's construction
with the TSA predicate is secure against subexponential-time attacks using SDP
hierarchies. Our candidate human-computable password schemes use functions
$f : \mathbb{Z}_{10}^k \to \mathbb{Z}_{10}$ instead of binary predicates. While our candidate functions are
contained in $NC^0$, we note that an arbitrary function in $NC^0$ is not necessarily
human computable.
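
The structure of Goldreich's construction with the Tri-Sum-And predicate can be sketched in a few lines of Python. The parameter values, the choice to draw each clause without repeated indices, and the names `tsa` and `goldreich_prg` are assumptions made for illustration only; nothing here is tuned for security.

```python
import random

def tsa(bits):
    """Tri-Sum-And predicate: x1 + x2 + x3 + x4*x5 mod 2."""
    x1, x2, x3, x4, x5 = bits
    return (x1 + x2 + x3 + x4 * x5) % 2

def goldreich_prg(seed_bits, clauses, predicate=tsa):
    """Output bit i is the predicate applied to the seed bits named by clause i."""
    return [predicate([seed_bits[j] for j in clause]) for clause in clauses]

# Toy parameters (hypothetical): n seed bits, m output bits, clauses of k = 5 indices.
rng = random.Random(1)
n, m, k = 16, 32, 5
clauses = [tuple(rng.sample(range(n), k)) for _ in range(m)]  # fixed once and for all
seed = [rng.randrange(2) for _ in range(n)]
output = goldreich_prg(seed, clauses)
```

Each output bit depends on only five seed bits, which is what places the generator in $NC^0$; the human computable functions in this chapter keep the same locality but work over digits rather than bits.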

Feldman et al. [73] considered the problem of finding a planted solution in a
random binary satisfiability problem. They showed that any statistical algorithm
(a class of algorithms that covers almost all known algorithmic techniques)
needs to see at least $\Omega(n^{r/2})$ random clauses to efficiently identify the planted
solution when the distribution over clauses is $(r-1)$-wise independent². Feldman
et al. [73] also demonstrate that $O(n^{r/2})$ clauses are sufficient. We extend the
analysis of Feldman et al. [73] to cover non-binary planted satisfiability problems,
and argue that our candidate human computable password schemes are secure.


3.3  Definitions 

3.3.1  Notation 

Given two strings $\alpha_1, \alpha_2 \in \mathbb{Z}_d^n$ we use $H(\alpha_1, \alpha_2) = |\{i \in [n] : \alpha_1[i] \neq \alpha_2[i]\}|$ to denote
the Hamming distance between them. We will also use $H(\alpha_1) = H(\alpha_1, \vec{0})$ to
denote the Hamming weight of $\alpha_1$. We use $\sigma : [n] \to \mathbb{Z}_d$ to denote a secret random
mapping that the user will memorize. We will sometimes abuse notation and
think of $\sigma \in \mathbb{Z}_d^n$ as a string which encodes the mapping, and we will use $\sigma \sim \mathbb{Z}_d^n$ to
denote a random mapping chosen from $\mathbb{Z}_d^n$ uniformly at random.

² We note that after we have seen $O(n \log n)$ random clauses the planted solution is, with high
probability, the only solution which satisfies all of the random clauses. The results of Feldman
et al. [73] are evidence that we need $\Omega(n^{r/2})$ examples to find the planted solution efficiently. We
also note that $r = 3$ for the uniform distribution over clauses that satisfy the TSA predicate, which
provides further evidence that Goldreich's PRG is secure for $m = n^{1.5-\epsilon}$.

Definition 6. We say that two mappings $\sigma_1, \sigma_2 \in \mathbb{Z}_d^n$ are $\epsilon$-correlated if
$\frac{H(\sigma_1, \sigma_2)}{n} \le \frac{d-1}{d} - \epsilon$, and we say that a mapping $\sigma \in \mathbb{Z}_d^n$ is $\delta$-balanced if

$$\max_{i \in \{0, \ldots, d-1\}} \left| \frac{H(\sigma, \vec{i})}{n} - \frac{d-1}{d} \right| \le \delta ,$$

where $\vec{i}$ denotes the string in which every entry equals $i$.

Note that for a random mapping $\sigma_2$ we expect $\sigma_1$ and $\sigma_2$ to differ at
$\mathbb{E}_{\sigma_2 \sim \mathbb{Z}_d^n}[H(\sigma_1, \sigma_2)] = n \frac{d-1}{d}$ locations, and for a random mapping $\sigma$ and
$i \sim \{0, \ldots, d-1\}$ we expect $\sigma$ to differ from $\vec{i}$ at $\mathbb{E}_{i, \sigma}[H(\sigma, \vec{i})] = n \frac{d-1}{d}$ locations.
Thus, for any constant $\epsilon > 0$ a random mapping $\sigma_2$ will not be $\epsilon$-correlated with $\sigma_1$
with probability $1 - o(1)$, but for any constant $\delta > 0$ a random mapping $\sigma$ will be
$\delta$-balanced with probability $1 - o(1)$.
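
The two notions in Definition 6 are easy to check mechanically. The sketch below assumes the reading of the definition in which a pair of mappings is $\epsilon$-correlated when their normalized Hamming distance is at most $(d-1)/d - \epsilon$, and a mapping is $\delta$-balanced when its normalized distance to every constant string is within $\delta$ of $(d-1)/d$; the function names and sample mappings are hypothetical.

```python
def hamming(a, b):
    """Number of positions where the two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def is_eps_correlated(s1, s2, d, eps):
    """True if H(s1, s2)/n <= (d-1)/d - eps."""
    return hamming(s1, s2) / len(s1) <= (d - 1) / d - eps

def is_delta_balanced(s, d, delta):
    """True if every digit i satisfies |H(s, i...i)/n - (d-1)/d| <= delta."""
    n = len(s)
    worst = max(abs(hamming(s, [i] * n) / n - (d - 1) / d) for i in range(d))
    return worst <= delta

# Hypothetical mappings of length n = 100 over base d = 10.
uniform_digits = list(range(10)) * 10             # every digit appears 10 times
shifted = [(v + 1) % 10 for v in uniform_digits]  # differs from it everywhere
constant = [7] * 100                              # a badly unbalanced mapping
```

A mapping is trivially correlated with itself, the shifted copy is not correlated with the original, and the constant mapping fails the balance test because it agrees with the string of all sevens everywhere.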


We let $X_k \subseteq [n]^k$ denote the space of ordered clauses of $k$ variables without
repetition. We use $C \sim X_k$ to denote a clause $C$ chosen uniformly at random from
$X_k$ and we use $\sigma(C) \in \mathbb{Z}_d^k$ to denote the values of the corresponding variables
in $C$. For example, if $d = 10$, $C = (3, 10, 59)$ and $\sigma(i) = (i + 1 \bmod 10)$ then
$\sigma(C) = (4, 1, 0)$.


We view each clause $C \in X_k$ as a single-digit challenge. The user responds to a
challenge $C$ by computing $f(\sigma(C))$, where $f : \mathbb{Z}_d^k \to \mathbb{Z}_d$ is a human computable
function (see discussion below) and $\sigma : [n] \to \mathbb{Z}_d$ is the secret mapping that the user
has memorized. For example, if $d = 10$, $C = (3, 10, 59)$, $\sigma(i) = (i + 1 \bmod 10)$ and
$f(x, y, z) = (x - y + z \bmod 10)$ then $f(\sigma(C)) = (4 - 1 + 0 \bmod 10) = 3$. A length-$t$
password challenge $\vec{C} = (C_1, \ldots, C_t) \in (X_k)^t$ is a sequence of $t$ single-digit
challenges, and $f(\sigma(\vec{C})) = (f(\sigma(C_1)), \ldots, f(\sigma(C_t))) \in \mathbb{Z}_d^t$ denotes the corresponding
response (e.g., a password).
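
The worked example above can be replayed in code. This is only a sketch of the example's arithmetic, using the example mapping $i \mapsto i + 1 \bmod 10$ and the example function $x - y + z \bmod 10$; the helper `password` and the second challenge passed to it in the usage below are hypothetical additions.

```python
def sigma(i):
    """Example secret mapping from the text: i -> (i + 1) mod 10."""
    return (i + 1) % 10

def f(x, y, z):
    """Example function from the text: (x - y + z) mod 10."""
    return (x - y + z) % 10

C = (3, 10, 59)                       # a single-digit challenge
values = tuple(sigma(i) for i in C)   # sigma(C)
response = f(*values)                 # f(sigma(C))

def password(challenges):
    """Response to a length-t password challenge: one digit per single-digit challenge."""
    return tuple(f(*(sigma(i) for i in c)) for c in challenges)
```

Calling `password([(3, 10, 59), (1, 2, 3)])` strings two single-digit responses together into a two-digit password, which is exactly how a length-$t$ challenge is answered.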

Let's suppose that the user has $m$ accounts $A_1, \ldots, A_m$. In a human
computable password management scheme we will generate $m$ length-$t$ password
challenges $\vec{C}_1, \ldots, \vec{C}_m \in (X_k)^t$. These challenges will be stored in persistent
memory so they are always accessible to the user as well as the adversary. When our
user needs to authenticate to account $A_i$ he will be shown the length-$t$ password
challenge $\vec{C}_i = (C_1^i, \ldots, C_t^i)$. The user will respond by computing his password

$$p_i = \langle f(\sigma(C_1^i)), \ldots, f(\sigma(C_t^i)) \rangle \in \mathbb{Z}_d^t .$$

3.3.2  Requirements  for  a  Human  Computable  Function

In our setting we require that the composite function $f \circ \sigma : X_k \to \mathbb{Z}_d$ is
human computable. A human computable function might involve several memory
lookups (e.g., we can ask the user to recall the value $\sigma(i)$) as well as several simple
operations. However, if we want the function $f \circ \sigma$ to be human computable then
we cannot ask the user to perform too many operations.

Requirement 1. A function $f$ is $t$-human computable for a human user $H$ if $H$ can
evaluate $f$ in his head in $t$ seconds.

Example: The function $f(x, y) = x + y \bmod 10$ is 1-human computable for many
humans.


Discussion. Informally we say that a function $f$ is human-computable if a human
user can evaluate $f$ quickly in his head. Intuitively, evaluation of a human
computable function must only involve a few operations; otherwise a human will
not be able to evaluate the function quickly. Furthermore, the operations must be
extremely simple. A human computable function must only involve operations
with a very low memory footprint as a typical person can only keep $7 \pm 2$ 'chunks'
of information in short-term memory [108] at any given time. If the memory
footprint of a function is high then the user will need to store intermediate values
in long-term memory and recall them mid-computation. We take the view that
no human computable function should require users to store intermediate
values in long-term memory because the memorization process would necessarily
slow down computation³. Therefore, we can rule out operations involving large
numbers. For example, expressions like 98423 + 498874 mod 2345 or 54322340489
mod 8156243869 would be very difficult, if not impossible, for most humans
to evaluate in their heads. Most humans would be capable of evaluating a long
expression like 7+1+6+0+8+3+4+7+2+7+8+9+5+3 mod 10 in their head
after receiving a few basic preliminary instructions (e.g., only worry about
remembering the least significant digit). However, even this expression would take
a while to evaluate because it involves many terms. Thus a human computable
function involves 1) simple operations with a very small memory footprint, 2) few
terms, and 3) few operations.
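
The instruction to keep only the least significant digit can be made concrete with a small sketch. It assumes the long expression in the discussion is the fourteen-term sum 7+1+6+0+8+3+4+7+2+7+8+9+5+3 mod 10, and the function name is a hypothetical label; the point is that the running state never exceeds one digit.

```python
def sum_mod10_in_head(digits):
    """Add digits modulo 10 while holding only one digit in working memory."""
    running = 0
    steps = 0
    for digit in digits:
        running = (running + digit) % 10  # discard the carry immediately
        steps += 1
    return running, steps

terms = [7, 1, 6, 0, 8, 3, 4, 7, 2, 7, 8, 9, 5, 3]
result, steps = sum_mod10_in_head(terms)
```

The memory footprint is a single digit, but the step count grows with the number of terms, which is why a function of this shape is simple yet still slow for a human to evaluate.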

³ However, we do consider functions that require users to retrieve values from long-term
memory. For example, the user might need to remember the value $\sigma(i)$ or the user might need to
remember basic arithmetic facts that he memorized in grade school (e.g., 9+5=14).

3.3.3  Password  Unforgeability


In the password forgeability game the adversary attempts to guess the user's
password for a randomly selected account after he has seen the user's passwords at
$m$ other randomly selected accounts. We say that a scheme is UF-RCA
(Unforgeability against Random Challenge Attacks) secure if any probabilistic polynomial
time adversary fails to guess the user's password with high probability. In the
password forgeability game we select the secret mapping $\sigma : [n] \to \mathbb{Z}_d$ uniformly
at random along with challenges $C_1, \ldots, C_{mt+t} \sim X_k$. The adversary is given the
function $f : \mathbb{Z}_d^k \to \mathbb{Z}_d$ and is shown the challenges $C_1, \ldots, C_{t(m+1)}$ as well as the
values $f(\sigma(C_i))$ for $i \in \{t+1, \ldots, mt+t\}$. The game ends when the adversary
outputs a guess $(q_1, \ldots, q_t) \in \mathbb{Z}_d^t$ for the value of $(f(\sigma(C_1)), \ldots, f(\sigma(C_t)))$. We say
that the adversary wins if he correctly guesses the responses to all of the challenges
$C_1, \ldots, C_t$. Formally, we use

$$\mathrm{Wins}(\mathcal{A}, n, m, t) = \forall i \in \{1, \ldots, t\} .\; q_i = f(\sigma(C_i))$$

to denote the event that the adversary wins the game. We are interested in
understanding how many example single-digit challenge-response pairs the adversary
needs to see before he can start breaking the user's passwords.

Definition 7. (Security) We say that a function $f : \mathbb{Z}_d^k \to \mathbb{Z}_d$ is UF-RCA$(n, m, t, \delta)$-secure
if for every probabilistic polynomial time (in $n$, $m$) adversary $\mathcal{A}$

$$\Pr[\mathrm{Wins}(\mathcal{A}, n, m, t)] \le \delta ,$$

where the randomness is taken over the selection of the secret mapping $\sigma \sim \mathbb{Z}_d^n$, the
challenges $C_1, \ldots, C_{mt+t}$ as well as the adversary's coins.
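
The forgeability game can be simulated to get a feel for the baseline that any real attack must beat. The sketch below is an illustration under stated assumptions: the adversary interface, the toy function `sum3`, and all parameter values are hypothetical, and the only adversary implemented is one that ignores what it sees and guesses uniformly at random.

```python
import random

def play_game(f, k, n, m, t, adversary, rng, d=10):
    """One round of the password forgeability game; True if the adversary wins."""
    sigma = [rng.randrange(d) for _ in range(n)]  # secret mapping, chosen uniformly
    challenges = [tuple(rng.sample(range(n), k)) for _ in range(m * t + t)]
    responses = [f(tuple(sigma[i] for i in c)) for c in challenges]
    # The adversary sees every challenge but only the responses to
    # challenges t+1, ..., mt+t (0-based positions t, ..., mt+t-1).
    guess = adversary(f, challenges, responses[t:], t, rng, d)
    return list(guess) == responses[:t]

def random_guesser(f, challenges, seen_responses, t, rng, d):
    """Baseline adversary: ignores everything it is shown."""
    return [rng.randrange(d) for _ in range(t)]

def sum3(x):
    """Toy function over three digits (not one of the candidate schemes)."""
    return sum(x) % 10

rng = random.Random(2024)
trials = 2000
wins = sum(play_game(sum3, 3, 50, 5, 1, random_guesser, rng) for _ in range(trials))
win_rate = wins / trials
```

With $t = 1$ and $d = 10$ the random guesser wins about one round in ten, and in general about $d^{-t}$ of the time; a scheme is interesting only if no efficient adversary does noticeably better after seeing the $mt$ revealed responses.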


Discussion. Our security model in this chapter is different from the security
model from Chapter 2 in which the adversary gets to adaptively select which
accounts to compromise and which account to attack. While our security model
may seem weaker at first glance because the adversary does not get to select
which account to compromise/attack, we observe that the password management
schemes of Chapter 2 are only secure against one to three adaptive breaches. By
contrast, our goal is to design human computable password schemes that satisfy
UF-RCA security for large values of $m$ (e.g., 100), which means that it is reasonable
to believe that the user has at most $m$ password protected accounts. If the user has
at most $m$ accounts then even an adaptive adversary (who gets to compromise
all but one account) will not be able to forge the password at any remaining
account with probability greater than $m\delta$ (typically, $m \ll 1/\delta$).

3.3.4  Security  Parameters  of  $f$

Given a function $f : \mathbb{Z}_d^k \to \mathbb{Z}_d$ we define the function $Q^f : \mathbb{Z}_d^{k+1} \to \{\pm 1\}$ s.t.
$Q^f(x, i) = 1$ if $f(x) = i$; otherwise $Q^f(x, i) = -1$. We use $Q^f_\sigma$ to define a distribution
over $X_k \times \mathbb{Z}_d$ (challenge-response pairs) as follows

$$\Pr_{Q^f_\sigma}[C, i] = \frac{Q^f(\sigma(C), i) + 1}{2 |X_k|} .$$

Intuitively, $Q^f_\sigma$ is the uniform distribution over challenge-response pairs $(C, j)$
s.t. $f(\sigma(C)) = j$. We also use $Q^{f,j} : \mathbb{Z}_d^k \to \{\pm 1\}$ ($Q^{f,j}(x) = Q^f(x, j)$) to define a
distribution over $X_k$. We write the Fourier decomposition of a function
$Q : \mathbb{Z}_d^k \to \{\pm 1\}$ as follows

$$Q(x) = \sum_{\alpha \in \mathbb{Z}_d^k} \hat{Q}_\alpha \, \chi_\alpha(x) ,$$

where our basis functions are

$$\chi_\alpha(x) = \exp\left( \frac{-2\pi \sqrt{-1}\, (x \cdot \alpha)}{d} \right) .$$

We say that a function $Q$ has degree $\ell$ if $\ell = \max\{H(\alpha) \mid \alpha \in \mathbb{Z}_d^k \wedge \hat{Q}_\alpha \neq 0\}$; equivalently,
if $Q(x) = \sum_i Q_i(x)$ can be expressed as a sum of functions where each function
$Q_i : \mathbb{Z}_d^k \to \mathbb{R}$ depends on at most $\ell$ variables.

Definition 8. We use $r(Q) = \min\{H(\alpha) \mid \exists \alpha \in \mathbb{Z}_d^k .\ \hat{Q}_\alpha \neq 0 \wedge \alpha \neq \vec{0}\}$ to denote the
distributional complexity of $Q$, and we use $r(f) = \min\{r(Q^{f,j}) \mid j \in \mathbb{Z}_d\}$ to denote the
distributional complexity of $f$. We use

$$g(f) = \min\{\ell \in \mathbb{N} \cup \{0\} \mid \exists \alpha \in \mathbb{Z}_d^\ell, S \subseteq [k] \text{ s.t. } |S| = \ell \text{ and } f|_{S,\alpha} \text{ is a linear function}\}$$

to denote the minimum number of variables that must be fixed to make $f$ a linear function.
Here, $f|_{S,\alpha} : \mathbb{Z}_d^{k-\ell} \to \mathbb{Z}_d$ denotes the function $f$ after fixing the variables at the indices
specified by $S$ to $\alpha$. Finally, we use $s(f) = \min\{r(f)/2, g(f) + 1\}$ as our composite security
measure.


We argue that a human computable password scheme, given by a function
$f$, is secure against $m = \Omega(n^{s(f)})$ breaches. In particular, we argue that any
statistical algorithm needs to see at least $m = \Omega(n^{r(f)/2})$ challenge-response pairs
to (approximately) recover the secret mapping $\sigma$. We also demonstrate that any
adversary that can break the security of our password scheme after seeing $m$
challenge-response pairs can be used to approximately recover the secret mapping
using only $O(m)$ challenge-response pairs.
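
For very small $d$ and $k$ the distributional complexity $r(f)$ can be computed by brute force, which may help build intuition for the definition. The sketch below assumes the reading in which $Q^{f,j}(x) = 1$ if $f(x) = j$ and $-1$ otherwise, the basis functions are $\chi_\alpha(x) = \exp(-2\pi\sqrt{-1}(x \cdot \alpha)/d)$, and $r$ is the smallest Hamming weight of a nonzero index with a nonzero Fourier coefficient. The function names and the numerical tolerance are hypothetical, and the cost grows like $d^{2k}$, so this is not practical for the 14-variable candidate functions.

```python
import cmath
from itertools import product

def fourier_coefficients(Q, d, k):
    """Coefficients c[alpha] such that Q(x) = sum_alpha c[alpha] * chi_alpha(x),
    where chi_alpha(x) = exp(-2*pi*i*(x . alpha)/d)."""
    points = list(product(range(d), repeat=k))
    coeffs = {}
    for alpha in points:
        total = 0
        for x in points:
            dot = sum(a * b for a, b in zip(x, alpha))
            # Multiply by the complex conjugate of chi_alpha(x).
            total += Q(x) * cmath.exp(2j * cmath.pi * dot / d)
        coeffs[alpha] = total / d ** k
    return coeffs

def distributional_complexity(f, d, k, tol=1e-9):
    """Smallest Hamming weight of a nonzero alpha with a nonzero coefficient,
    minimized over the indicator functions Q^{f,j}; None if there is none."""
    best = None
    for j in range(d):
        Qj = lambda x, j=j: 1 if f(x) == j else -1
        for alpha, c in fourier_coefficients(Qj, d, k).items():
            weight = sum(1 for a in alpha if a != 0)
            if weight > 0 and abs(c) > tol and (best is None or weight < best):
                best = weight
    return best

# Toy functions over Z_3 with k = 2 inputs (illustrative only).
add_mod3 = lambda x: (x[0] + x[1]) % 3
first_input = lambda x: x[0]
```

On these toy inputs the sum of two digits has distributional complexity 2, because no single input reveals anything about the output, while a function that copies its first input has complexity 1.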


3.4  Statistical  Adversaries  and  Lower  Bounds 

Our main technical result (Theorem 6) is a lower bound on the number of
single-digit challenge-response pairs that a statistical algorithm needs to see to
(approximately) recover the secret mapping $\sigma$. Our results are quite general and may
be of independent interest. Given any function $f : \mathbb{Z}_d^k \to \mathbb{Z}_d$ we prove that any
statistical algorithm needs $\Omega(n^{r(f)/2})$ examples before it can find a secret mapping
$\sigma' \in \mathbb{Z}_d^n$ such that $\sigma'$ is $\epsilon$-correlated with $\sigma$. We first introduce statistical algorithms
in Section 3.4.1 before stating our main lower bound for statistical algorithms in
Section 3.4.2. We also provide a high level overview of our proof in Section 3.4.2.


3.4.1  Statistical  Algorithms 

Let $\mathcal{D}$ denote a set of distributions over a domain $X$, let $\mathcal{F}$ denote a set of solutions
and $\mathcal{Z} : \mathcal{D} \to 2^{\mathcal{F}}$. The distributional search problem [72] $\mathcal{Z}$ over $\mathcal{D}$ and $\mathcal{F}$ is
the following problem: Given access to $m$ random samples from an unknown
distribution $D \in \mathcal{D}$, find a solution $s \in \mathcal{Z}(D) \subseteq \mathcal{F}$. For a solution $s \in \mathcal{F}$ we
will use $\mathcal{Z}^{-1}(s) \subseteq \mathcal{D}$ to denote the set of distributions for which $s$ is a valid
solution (i.e., $D' \in \mathcal{D}$ s.t. $s \in \mathcal{Z}(D')$). We can think of our planted constrained
satisfiability problem as a distributional search problem. For example, in our
context $X = X_k \times \mathbb{Z}_d \subseteq [n]^k \times \mathbb{Z}_d$ denotes the set of all possible single-digit
challenge-response pairs, and our solution space $\mathcal{F} = \mathbb{Z}_d^n$ is the set of possible mappings.
Each $\sigma \in \mathcal{F}$ defines a unique distribution $D_\sigma = Q^f_\sigma$ over challenge-response pairs
and $\mathcal{Z}(D_\sigma)$ would denote the set of all assignments $\tau \in \mathbb{Z}_d^n$ that are $\epsilon$-correlated
with $\sigma$. Now the distributional search problem is to find an assignment $\tau \in \mathbb{Z}_d^n$
that is $\epsilon$-correlated with our planted solution $\sigma$ (the secret mapping) given $m$
challenge-response pairs.

A statistical algorithm can access the input distribution by querying the 1-MSTAT
oracle or by querying the VSTAT oracle.

Definition 9. [73] [1-MSTAT($L$) oracle and VSTAT oracle] Let $D$ be the input distribution
over the domain $X$. Given any function $h : X \to \{0, 1, \ldots, L-1\}$, 1-MSTAT($L$) takes
a random sample $x$ from $D$ and returns $h(x)$. For an integer parameter $T > 0$ and any
query function $h : X \to \{0, 1\}$, VSTAT($T$) returns a value $v \in [p - \tau, p + \tau]$ where
$p = \mathbb{E}_{x \sim D}[h(x)]$ and $\tau = \max\left\{ \frac{1}{T}, \sqrt{\frac{p(1-p)}{T}} \right\}$.
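
To make the two oracles concrete, here is a minimal simulation over a finite distribution. It assumes the usual VSTAT tolerance $\tau = \max\{1/T, \sqrt{p(1-p)/T}\}$, and it answers VSTAT queries with a random value inside the allowed interval, whereas the definition permits any value in that interval (including adversarially chosen ones). All names are hypothetical.

```python
import math
import random

def one_mstat(sample, h, L):
    """1-MSTAT(L): draw one sample x and return h(x), which must lie in {0, ..., L-1}."""
    value = h(sample())
    assert 0 <= value < L
    return value

def vstat_tolerance(p, T):
    """Tolerance tau = max(1/T, sqrt(p(1-p)/T)) (assumed form of the definition)."""
    return max(1.0 / T, math.sqrt(p * (1 - p) / T))

def vstat(dist, h, T, rng):
    """VSTAT(T) for a finite distribution given as {outcome: probability}.
    Returns some value within tau of p = E[h(x)]; here a random one."""
    p = sum(prob * h(x) for x, prob in dist.items())
    tau = vstat_tolerance(p, T)
    return p + rng.uniform(-tau, tau)

# Hypothetical toy distribution: a fair bit, queried with the identity function.
rng = random.Random(7)
fair_bit = {0: 0.5, 1: 0.5}
estimate = vstat(fair_bit, lambda x: x, 100, rng)
one_answer = one_mstat(lambda: rng.choice([0, 1]), lambda x: x, 2)
```

A statistical algorithm only ever interacts with the distribution through calls like these, so a lower bound on the number of oracle calls translates into a lower bound on the number of samples it effectively uses.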

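The discrimination norm [73] of a set of distributions $\mathcal{D}'$ relative to a distribution
$D$ is denoted by $\kappa_2(\mathcal{D}', D)$ and defined as follows:

$$\kappa_2(\mathcal{D}', D) = \max_{h, \|h\|_D = 1} \left\{ \mathbb{E}_{D' \sim \mathcal{D}'} \left[ \left| \mathbb{E}_{D'}[h] - \mathbb{E}_D[h] \right| \right] \right\} ,$$

where $\|h\|_D = \sqrt{\mathbb{E}_{x \sim D}[h^2(x)]}$. In our setting $\mathcal{D}' \subseteq \{Q^f_\sigma \mid \sigma \in \mathbb{Z}_d^n\}$ and our reference
distribution $D$ is the uniform distribution over $X_k \times \mathbb{Z}_d$ so we can write

$$\kappa_2(\mathcal{D}', D) = \max_{h, \|h\|_D = 1} \left\{ \mathbb{E}_{\sigma \sim \mathcal{D}'} \left[ |\Delta(\sigma, h)| \right] \right\} ,$$

where

$$\Delta(\sigma, h) = \mathbb{E}_{(C, i) \sim Q^f_\sigma}[h(C, i)] - \mathbb{E}_{(C, i) \sim X_k \times \mathbb{Z}_d}[h(C, i)] .$$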

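Definition 10. [73] For $\kappa > 0$, $\eta > 0$, domain $X$ and a search problem $\mathcal{Z}$ over a set of
solutions $\mathcal{F}$ and a class of distributions $\mathcal{D}$ over $X$, let $d'$ be the largest integer such that
there exists a reference distribution $D$ over $X$ and a finite set of distributions $\mathcal{D}_D \subseteq \mathcal{D}$
with the following property: for any solution $s \in \mathcal{F}$ the set $\mathcal{D}_s = \mathcal{D}_D \setminus \mathcal{Z}^{-1}(s)$ has size at
least $(1 - \eta) \cdot |\mathcal{D}_D|$ and for any subset $\mathcal{D}' \subseteq \mathcal{D}_s$, where $|\mathcal{D}'| \ge |\mathcal{D}_s| / d'$, $\kappa_2(\mathcal{D}', D) \le \kappa$. The
statistical dimension with discrimination norm $\kappa$ and error parameter $\eta$ of $\mathcal{Z}$ is $d'$ and
denoted by $\mathrm{SDN}(\mathcal{Z}, \kappa, \eta)$.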


Feldman et al. [73] proved the following lower bound on the number of
1-MSTAT($L$) queries needed to solve a distributional search problem. Intuitively,
Theorem 5 implies that many queries are needed to solve a distributional search
problem with high statistical dimension. In Section 3.4.2 we argue that the
statistical dimension of our distributional search problem (finding $\sigma'$ that is $\epsilon$-correlated
with the secret mapping $\sigma$ given $m$ samples from the distribution $Q^f_\sigma$) is high.

Theorem 5. [73, Theorems 10 and 12] Let $X$ be a domain and $\mathcal{Z}$ be a search problem over a set of solutions $\mathcal{F}$ and a class of distributions $\mathcal{D}$ over $X$. For $\kappa > 0$ and $\eta \in (0, 1)$, let $d' = \mathrm{SDN}(\mathcal{Z}, \kappa, \eta)$. Let $D$ be the reference distribution and $\mathcal{D}_D$ be a set of distributions for which the value $d'$ is achieved. Any randomized statistical algorithm that, given access to a VSTAT (resp. 1-MSTAT($L$)) oracle for a distribution chosen randomly and uniformly from $\mathcal{D}_D$, succeeds with probability $\Lambda > \eta$ over the choice of distribution and internal randomness requires at least $\frac{\Lambda - \eta}{1 - \eta} d'$ (resp. $\Omega\left(\frac{1}{L} \min\left\{\frac{d'(\Lambda - \eta)}{1 - \eta}, \frac{(\Lambda - \eta)^2}{\kappa^2}\right\}\right)$) calls to the oracle.


As Feldman et al. [73] observe, almost all known algorithmic techniques can be modeled within the statistical query framework. In particular, techniques like Expectation Maximization [62], local search, MCMC optimization [83], first and second order methods for convex optimization, PCA, ICA and k-means can be modeled as statistical algorithms even with $L = 2$; see [37] and [55] for proofs. One issue is that a statistical simulation might need polynomially more samples. However, for $L > 2$ we can think of our queries to 1-MSTAT($L$) as evaluating $L$ disjoint functions on a random sample. Indeed, Feldman et al. [73] demonstrate that there is a statistical algorithm for binary planted satisfiability problems using $\tilde{O}(n^{r(f)/2})$ calls to 1-MSTAT($n^{\lceil r(f)/2 \rceil}$).

Remark 1. We can also use the statistical dimension to lower bound the number of queries that an algorithm would need to make to other types of statistical oracles to solve a distributional search problem. For example, we could also consider an oracle MVSTAT($L, T$) that takes a query $h : X \to \{0, \ldots, L-1\}$ and a set $\mathcal{S}$ of subsets of $\{0, \ldots, L-1\}$ and returns a vector $v \in \mathbb{R}^L$ s.t. for every $Z \in \mathcal{S}$

$$\left| \sum_{i \in Z} v[i] - p_Z \right| \le \max\left\{\frac{1}{T}, \sqrt{\frac{p_Z(1 - p_Z)}{T}}\right\},$$

where $p_Z = \Pr_{x \sim D}[h(x) \in Z]$ and the cost of the query is $|\mathcal{S}|$. Feldman et al. [73, Theorem 7] proved lower bounds similar to Theorem 5 for the MVSTAT oracle. In this paper we focus on the 1-MSTAT and VSTAT oracles for simplicity of presentation.


3.4.2  Statistical  Dimension  Lower  Bounds 

We  are  now  ready  to  state  our  main  technical  result. 

Theorem 6. Let $\sigma \in \mathbb{Z}_d^n$ denote a secret mapping chosen uniformly at random and let $\mathcal{Z}_Q$ be a planted constrained satisfiability problem with distribution $Q_\sigma^f$ over $X_k \times \mathbb{Z}_d$, where $f$ has distributional complexity $r = r(f)$. Any randomized statistical algorithm that finds an assignment $\tau$ such that $\tau$ is $\sqrt{\frac{-2\ln(\eta/2)}{n}}$-correlated with $\sigma$ with probability at least $\Lambda > \eta$ over the choice of $\sigma$ and the internal randomness of the algorithm needs at least $m$ calls to the 1-MSTAT($L$) oracle (resp. the VSTAT oracle) with $m \cdot L \ge c_1 \left(\frac{n}{\log n}\right)^r$ (resp. $m \ge n^{c_1 \log n}$) for a constant $c_1 = \Omega_{k, 1/(\Lambda - \eta)}(1)$. In particular, if we set $L = \left(\frac{n}{\log n}\right)^{r/2}$ then the algorithm needs at least $m \ge c_1 \left(\frac{n}{\log n}\right)^{r/2}$ calls to 1-MSTAT($L$).


The proof of Theorem 6 follows from Theorems 7 and 5. Theorems 6 and 7 generalize results of Feldman et al. The results of Feldman et al. [73] only apply for functions $f : \{0, 1\}^k \to \{0, 1\}$. An interested reader can find our proofs in Appendix 8.2. At a high level our proof proceeds as follows: Given any function $h : X_k \to \mathbb{R}$ we show that $\Delta(\sigma, h)$ can be expressed in the following form:


where $|X_\ell| = \Theta(n^\ell)$ and each function $b_\ell$ has degree $\ell$ (Lemma 5). We then use the general hypercontractivity theorem [116, Theorem 10.23] to obtain the following concentration bound.


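Lemma 3. Let $b : \mathbb{Z}_d^n \to \mathbb{R}$ be any function with degree at most $\ell$, and let $S \subseteq \mathbb{Z}_d^n$ be a set of assignments for which $d' = d^n / |S| \ge e^\ell$. Then $\mathbb{E}_{\sigma \sim S}[|b(\sigma)|] \le 2(\ln d' / c_0)^{\ell/2} \|b\|_2$, where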

Co  =  €  (^)  and  \\b\\2  =  b  {xf  . 


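We then use Lemma 3 to bound $\mathbb{E}_{\sigma \sim S}[|\Delta(\sigma, h)|]$ for any set $S \subseteq \mathbb{Z}_d^n$ such that $|S| = |\mathbb{Z}_d^n| / d'$ (Lemma 8). This leads to the following bound: $\kappa_2(\mathcal{D}', D) = O_k\left((\ln d' / n)^{r/2}\right)$.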

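Theorem 7. Let $\mathcal{Z}_{Q,\epsilon}$ denote the problem of finding, for every $\sigma \in \mathbb{Z}_d^n$, an assignment $\tau \in \mathbb{Z}_d^n$ that is $\epsilon$-correlated with $\sigma$ given access to the distribution $Q_\sigma^f$ over $X_k \times \mathbb{Z}_d$. Then there exists a constant $c_Q > 0$ such that for any $\epsilon \ge 1/\sqrt{n}$ and $q \ge n$,

$$\mathrm{SDN}\left(\mathcal{Z}_{Q,\epsilon},\ \frac{c_Q (\log q)^{r/2}}{n^{r/2}},\ 2e^{-n\epsilon^2/2}\right) \ge q,$$

where $r = r(f)$ is the distributional complexity of $f$.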
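The link between the error parameter in Theorem 7 and the correlation level in Theorem 6 can be checked with a short calculation. The sketch below assumes that the error parameter of Theorem 7 is $\eta = 2e^{-n\epsilon^2/2}$ and simply solves for $\epsilon$; it is a consistency check, not part of the proof in Appendix 8.2.

```latex
\eta = 2e^{-n\epsilon^2/2}
\;\Longleftrightarrow\;
\ln(\eta/2) = -\frac{n\epsilon^2}{2}
\;\Longleftrightarrow\;
\epsilon = \sqrt{\frac{-2\ln(\eta/2)}{n}} .
```

Under this assumption, an algorithm that succeeds with probability $\Lambda > \eta$ must output an assignment whose correlation with $\sigma$ is at least this value of $\epsilon$.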
3.5  Security  Analysis


In this section we analyze the security of a human computable password scheme using our statistical query lower bounds as a building block. In Section 3.5.1 we show that any adversary that breaks UF-RCA security can also (approximately) recover the secret mapping $\sigma$. As we showed in Section 3.4, statistical algorithms need at least $m = \tilde{\Omega}(n^{r(f)/2})$ challenge-response pairs to recover the secret mapping. This implies that no statistical adversary can break UF-RCA security. This
is  significant  because  most  known  algorithmic  techniques  can  be  modeled  within 
the  statistical  query  framework.  While  Gaussian  Elimination  is  a  notable  excep¬ 
tion,  we  show  that  an  adversary  needs  m  =  Q  (;irC)/2j  challenge-response  pairs  to 
recover  a  using  Gaussian  Elimination  in  Section  3.5.2  . 

3.5.1  Breaking  UF-RCA  is  Equivalent  to  Secret  Recovery 

Theorem 6 only establishes that it is hard for a statistical adversary to properly learn the secret mapping $\sigma$. Could an adversary win our password security game without properly learning the secret mapping? In learning theory it is NP-hard to find a 2-term DNF that is consistent with a given dataset. However, just because 2-DNF is hard to learn in the proper learning model does not mean that learning 2-DNF is hard. Indeed, if we allow our learning algorithm to output a linear classifier instead of a 2-term DNF then 2-DNF is easy to learn [96]. Of course, for some functions it is very easy to predict challenge-response pairs without learning $\sigma$. For example, if $f$ is the constant function (or any function highly correlated with the constant function) then it is easy to predict the value of $f(\sigma(C))$. However, any function that is highly correlated with a constant function is a poor choice for a human computable password scheme. We argue that any adversary that can win the password game can be converted into an adversary that properly learns $\sigma$, provided that our function $f$ has certain reasonable properties.

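Definition 11. We say that a function $f$ is $(\delta_1, \delta_2)$-hard to predict if for all $\sigma, \sigma' \in \mathbb{Z}_d^n$ s.t. $\sigma$ is $\delta_1$-balanced and $\sigma'$ is not $\delta_1$-correlated with $\sigma$ we have

$$\Pr_{C \sim X_k}\left[f(\sigma(C)) = f(\sigma'(C))\right] \le \frac{1}{d} + \delta_2.$$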

Intuitively, Definition 11 says that if $\sigma$ is approximately balanced (e.g., for each $i \in \mathbb{Z}_d$ the string $\sigma$ contains approximately $n/d$ $i$'s) and $\sigma'$ is not $\delta_1$-correlated with $\sigma$ then $f(\sigma'(C))$ is not a good predictor of $f(\sigma(C))$. We note that if $f$ is highly correlated with a constant function then $f(\sigma'(C))$ will always be a good predictor of $f(\sigma(C))$.
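The agreement probability in Definition 11 can be estimated empirically for any concrete $f$, $\sigma$ and $\sigma'$. The Python sketch below is a Monte Carlo estimate under the assumption that a clause $C \sim X_k$ is a uniformly random tuple of $k$ distinct variable indices; the helper name `agreement_rate` is illustrative, and such sampling gives evidence about a particular pair of mappings, not a proof that $f$ is hard to predict.

```python
import random


def agreement_rate(f, sigma, sigma_prime, k, trials, rng):
    """Estimate Pr_C[f(sigma(C)) == f(sigma'(C))] over random clauses C.

    Assumption: a clause is k distinct variable indices drawn uniformly from
    {0, ..., n-1}; sigma and sigma_prime are lists of digits of length n.
    """
    n = len(sigma)
    hits = 0
    for _ in range(trials):
        clause = rng.sample(range(n), k)
        if f([sigma[v] for v in clause]) == f([sigma_prime[v] for v in clause]):
            hits += 1
    return hits / trials
```

A constant function agrees on every clause regardless of $\sigma'$, which matches the remark that functions highly correlated with a constant are always easy to predict.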

Corollary 1 says that any statistical adversary needs to see at least $\tilde{\Omega}(n^{r(f)/2})$ example challenge-response pairs before it can accurately guess the value of $f(\sigma(C))$ for a randomly chosen challenge $C \in X_k$. Corollary 1 follows easily from Theorems 8 and 6.

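Theorem 8. Let $f$ be $(\delta_1, \delta_2)$-hard to predict, let $\sigma \sim \mathbb{Z}_d^n$ denote the secret mapping, let $\epsilon > 0$ be any constant and suppose that we are given labels $\ell_C \in \mathbb{Z}_d$ for every $C \in X_k$ s.t.

$$\Pr_{C \sim X_k}\left[f(\sigma(C)) = \ell_C\right] \ge \frac{1}{d} + \delta_2 + \epsilon.$$

There is a polynomial time algorithm (in $n$, $1/\epsilon$, $1/\delta_2$) that with high probability finds a mapping $\sigma' \in \mathbb{Z}_d^n$ such that $\sigma'$ is $\delta_1$-correlated with $\sigma$, provided that $\sigma$ is $\delta_1$-balanced.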

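Corollary 1. Let $\epsilon > 0$ be any constant and let $\mathcal{A}$ be any statistical adversary that outputs labels $\ell_C$ for each clause $C \in X_k$ after making at most $m = \tilde{O}(n^{r(f)/2})$ queries to 1-MSTAT($n^{r(f)/2}$). If $f$ is $(\delta_1, \delta_2)$-hard to predict then with probability $1 - o(1)$ we have

$$\Pr_{\sigma \sim \mathbb{Z}_d^n,\ C \sim X_k}\left[f(\sigma(C)) = \ell_C\right] \le \frac{1}{d} + \delta_2 + \epsilon.$$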


We briefly overview the proof of Theorem 8 here; see Appendix 8.4 for more details. We first randomly partition $[n]$ into $n/\tau$ parts $S_1, \ldots, S_{n/\tau}$ where $\tau = O(\log n)$. For each set $S_i$ we can check all of the mappings $\sigma_i'(S_i) \in \mathbb{Z}_d^\tau$ to find the one that is consistent with the most noisy labels $\ell_C$ for each $C \subseteq S_i$. With high probability for each set $S_i$ we have

$$\Pr\left[\ell_C = f(\sigma(C)) \mid C \subseteq S_i\right] \approx \Pr\left[\ell_C = f(\sigma(C))\right].$$

Because $f$ is $(\delta_1, \delta_2)$-hard to predict this means that for each $i$ the strings $\sigma_i'(S_i) \in \mathbb{Z}_d^\tau$ and $\sigma(S_i) \in \mathbb{Z}_d^\tau$ are $\delta_2$-correlated. We combine each of the $\sigma_i'$ mappings to construct a mapping $\sigma' \in \mathbb{Z}_d^n$ that is $\delta_2$-correlated with $\sigma$.

Theorem 9 further extends our argument. Any adversary $\mathcal{A}$ that can win our security game could be used in a blackbox manner to recover a mapping $\sigma'$ that is highly correlated with the secret mapping $\sigma$. Theorem 9 implies that no statistical adversary can break the security of our human computable password schemes with $\tilde{O}(n^{r(f)/2})$ examples.
Theorem 9. Suppose that $f$ is $(\delta_1, \delta_2)$-hard to predict, but that $f$ is not UF-RCA($n, m, t, \delta$)-secure for $\delta > \left(\frac{1}{d} + \delta_2 + \epsilon\right)^t$. Then there is a probabilistic polynomial time algorithm (in $n$, $m$, $1/\delta_1$, $1/\delta_2$, $1/\epsilon$) that extracts a string $\sigma' \in \mathbb{Z}_d^n$ that is $c$-correlated with $\sigma$ after seeing $O(m)$ examples, where $c > 0$ is a constant.

To prove Theorem 9 we first show how to use the adversary as a blackbox to generate (noisy) predictions $\ell_C$ for every clause $C \in X_k$. By Lemma 8 we can use these predictions to find a mapping $\sigma'$ that is highly correlated with the secret mapping $\sigma$.

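We use $\mathcal{A}_{C_1,\ldots,C_m} : (X_k)^t \to \mathbb{Z}_d^t$ to denote an adversary who sees examples $C_1, \ldots, C_m \in X_k$ and $f(\sigma(C_1)), \ldots, f(\sigma(C_m))$. $\mathcal{A}_{C_1,\ldots,C_m}(C_1', \ldots, C_t') \in \mathbb{Z}_d^t$ denotes the adversary's prediction of $f(\sigma(C_1')), \ldots, f(\sigma(C_t'))$. Given a function $b : (X_k)^t \to \mathbb{Z}_d^t$, challenges $C_1', \ldots, C_t' \in X_k$ and responses $f(\sigma(C_1')), \ldots, f(\sigma(C_t'))$ we use $\mathcal{P}_{b,C_1',\ldots,C_t'} : X_k \times [t] \to \mathbb{Z}_d \cup \{\perp\}$ to predict the value of a clause $C \in X_k$:

$$\mathcal{P}_{b,C_1',\ldots,C_t'}(C, i) = \begin{cases} b(\hat{C}_1, \ldots, \hat{C}_t)[i] & \text{if } f(\sigma(C_j')) = b(\hat{C}_1, \ldots, \hat{C}_t)[j] \text{ for all } j < i, \\ \perp & \text{otherwise,} \end{cases}$$

where $\hat{C}_i = C$ and $\hat{C}_j = C_j'$ for $j \ne i$. We allow our predictor $\mathcal{P}_{b,C_1',\ldots,C_t'}(C, i)$ to output $\perp$ when it is unsure. Informally, Claim 1 says that for $b = \mathcal{A}_{C_1,\ldots,C_m}$ our predictor $\mathcal{P}_{b,C_1',\ldots,C_t'}$ is reasonably accurate whenever it is not unsure. The proof of Claim 1 can be found in Appendix 8.4. Briefly, Claim 1 follows because for $b = \mathcal{A}_{C_1,\ldots,C_m}$ we have

$$\Pr[\mathrm{Wins}(\mathcal{A}, n, m, t)] = \prod_{i=1}^{t} \Pr_{\substack{C \sim X_k \\ C_1,\ldots,C_m \sim X_k \\ C_1',\ldots,C_t' \sim X_k}}\left[\mathcal{P}_{b,C_1',\ldots,C_t'}(C, i) = f(\sigma(C)) \;\middle|\; \mathcal{P}_{b,C_1',\ldots,C_t'}(C, i) \ne \perp\right].$$

Claim 1. Let $\mathcal{A}$ be an adversary s.t. $\Pr[\mathrm{Wins}(\mathcal{A}, n, m, t)] > \left(\frac{1}{d} + \delta + \epsilon\right)^t$ and let $b = \mathcal{A}_{C_1,\ldots,C_m}$. Then

$$\Pr_{\substack{i \sim [t],\ C \sim X_k \\ C_1,\ldots,C_m \sim X_k \\ C_1',\ldots,C_t' \sim X_k}}\left[\mathcal{P}_{b,C_1',\ldots,C_t'}(C, i) = f(\sigma(C)) \;\middle|\; \mathcal{P}_{b,C_1',\ldots,C_t'}(C, i) \ne \perp\right] > \frac{1}{d} + \delta + \epsilon.$$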
Now we can select a random index $i_C \sim [t]$ for each clause $C \in X_k$, and set $\ell_C = \mathcal{P}_{b,C_1',\ldots,C_t'}(C, i_C)$ whenever $\mathcal{P}_{b,C_1',\ldots,C_t'}(C, i_C) \ne \perp$. The remaining challenge is that we need labels for all of the clauses $C \in X_k$ before we can apply Theorem 8. To ensure that all clauses are labeled we construct multiple independent predictors. Notice that for each clause $C \in X_k$ the probability that $\mathcal{P}_{b,C_1',\ldots,C_t'}(C, i_C) \ne \perp$ is at least $1/t$ (the probability that $i_C = 1$).


3.5.2  Gaussian  Elimination 

Most known algorithmic techniques can be modeled within the statistical query framework. Gaussian Elimination is a notable exception. As an example consider the function $f(x_1, \ldots, x_7) = x_1 + \ldots + x_7 \bmod 10$ (in this example $r(f) = 7$ and $g(f) = 0$). Our previous results imply that any statistical algorithm would need to see at least $m = \tilde{\Omega}(n^{7/2})$ challenge-response pairs $(C, f(\sigma(C)))$ to recover $\sigma$. However, it is trivial to recover $\sigma$ from $O(n)$ random challenge-response pairs using Gaussian Elimination. In general, consider the attacker shown in Algorithm 3.1, which uses Gaussian Elimination. Algorithm 3.1 relies on the subroutine TryExtract($C, f(\sigma(C)), S, \alpha$), which attempts to extract a linear constraint from $(C, f(\sigma(C)))$ under the assumption that $\sigma(S) = \alpha$. We assume TryExtract($C, f(\sigma(C)), S, \alpha$) returns $\emptyset$ if it cannot extract a linear constraint.


Algorithm 3.1 GaussianAttack

Input: Clauses $C_1, \ldots, C_m \sim X_k$, and labels $f(\sigma(C_1)), \ldots, f(\sigma(C_m))$.
for all $S \in X_{g(f)}$, $\alpha \in \mathbb{Z}_d^{g(f)}$ do
    LC ← $\emptyset$    ▷ LC is the set of linear constraints extracted
    for all $C \in \{C_1, \ldots, C_m\}$ do
        LC ← LC ∪ TryExtract($C, f(\sigma(C)), S, \alpha$)
    if |LC| ≥ $n$ then
        $\sigma'$ ← LinearSolve(LC)
        if $\forall i \in [m].\ f(\sigma'(C_i)) = f(\sigma(C_i))$ then return $\sigma'$


Fact 1 says that an attacker needs at least $m = \tilde{\Omega}(n^{1+g(f)})$ challenge-response pairs to recover $\sigma$ using Gaussian Elimination. This is because the probability that TryExtract($C, f(\sigma(C)), S, \alpha$) extracts a linear constraint is at most $O(n^{-g(f)})$ for $|S|$ constant. The adversary needs $O(n)$ linearly independent constraints to run Gaussian Elimination. If the adversary can see at most $\tilde{O}(n^{s(f)})$ examples neither approach (Statistical Algorithms or Gaussian Elimination) can be used to recover $\sigma$.

Fact 1. Algorithm 3.1 needs to see at least $m = \tilde{\Omega}(n^{1+g(f)})$ challenge-response pairs to recover $\sigma$.

Remark 2 explores the tradeoff between the adversary's running time and the number of challenge-response pairs that an adversary would need to see to recover $\sigma$ using Gaussian Elimination. In particular, the adversary can recover $\sigma$ from $\tilde{O}(n^{1+g(f)/2})$ challenge-response pairs if he is willing to increase his running time by a factor of $d^{\sqrt{n}}$. In practice, this attack may be reasonable for $n \le 100$ and $d = 10$, which means that it may be beneficial to look for candidate human computable functions $f$ that maximize $\min\{r(f)/2, 1 + g(f)/2\}$ instead of $s(f)$ whenever $n \le 100$.

Remark 2. If the adversary correctly guesses the value of $\sigma(S)$ for $|S| = n^\epsilon$ then he may be able to extract a linear constraint from a random example with probability $\Omega(1/n^{(1-\epsilon)g(f)})$. The adversary would only need $O(n^{1+(1-\epsilon)g(f)})$ examples to solve for $\sigma$, but his running time would be proportional to $d^{n^\epsilon}$, the expected number of guesses before he is correct.
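As a worked instance of the tradeoff in Remark 2, the following calculation plugs $\epsilon = 1/2$, $n = 100$ and $d = 10$ into the expressions above. It only illustrates why the text treats the attack as plausible for small $n$; the concrete numbers are derived here and are not quoted from the source.

```latex
\epsilon = \tfrac{1}{2}: \quad
\text{examples} = O\!\left(n^{1 + g(f)/2}\right), \qquad
\text{guesses} \approx d^{\sqrt{n}} ;
\qquad
n = 100,\ d = 10: \quad d^{\sqrt{n}} = 10^{10} .
```

Roughly $10^{10}$ guesses is within reach of a determined attacker, whereas guessing the whole mapping would take $d^n = 10^{100}$ attempts.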


3.6  Candidate  Secure  Human  Computable  Functions 

For all of our candidate human computable functions $f : \mathbb{Z}_d^k \to \mathbb{Z}_d$ we fix $d = 10$ because most humans are used to performing arithmetic operations on digits. A good human computable function should balance security and usability. A secure human computable function should have $r(f)$ and $g(f)$ large. This makes it challenging to simultaneously achieve usability because a usable human computable function should only require the user to perform a few simple operations to evaluate $f$. We present two candidate human computable functions and analyze their security parameters. We consider the usability of our human computable password schemes by (1) discussing ways that the secret mapping could be memorized easily, (2) analyzing the extra effort that a user needs to spend rehearsing to remember the secret mapping $\sigma$, and (3) estimating the time it would take a human user to compute a password. Algorithms 3.2 and 3.3 illustrate the authentication process. To protect users from offline attacks in the event of a server breach, passwords should be stored using a slow cryptographic hash function $H$ like BCRYPT [122]. Servers could also use GOTCHAs (Chapter 6) or HOSPs [51] for additional protection.
Figure 3.1: Mnemonics to help memorize the secret mapping $\sigma$. (a) $M_{D,2}$; (b) $M_{D,9}$.


Mnemonics to help memorize $\sigma$. In practice, we envision that the user memorizes a mapping from $n$ objects (e.g., images) to digits. For example, if $n = 26$ and $d = 10$ then the user might memorize a mapping from characters to digits. To memorize the mapping $\sigma(D) = 2$ we might show a visually inclined user an animation of the letter D transforming into a 2 (see Figure 3.1a). If instead $\sigma(D) = 9$ then we would show the user a different animation (see Figure 3.1b). One nice feature of this approach is that we only need to generate $nd$ illustrations to help our users memorize any mapping (e.g., to help users memorize any mapping from characters to digits we would need just 260 such illustrations, 10 for each character).


Algorithm 3.2 CreateChallenges

Input: $n$, $m$, base $d$, random bits $b$, images $I_1, \ldots, I_n$, and mnemonic helpers $M_{i,j}$ for $i \in [n]$, $j \in \{0, \ldots, d-1\}$.
▷ Generate and Memorize Secret Mapping
for $i = 1 \to n$ do
    $\sigma(i) \sim \{0, \ldots, d-1\}$    ▷ Using random bits $b$
    $M_i \leftarrow M_{i,\sigma(i)}$
(User) Using $M_i$, memorizes the association $(I_i, \sigma(i))$ for $i \in [n]$.
▷ Generate Challenges
for $i = 1 \to m$ do
    for $j = 1 \to t$ do
        $C_j^i \sim X_k$
    $C_i \leftarrow (C_1^i, \ldots, C_t^i)$    ▷ $H$ is a strong cryptographic hash function
    (User) Computes $(q_1, \ldots, q_t) \leftarrow (f(\sigma(C_1^i)), \ldots, f(\sigma(C_t^i)))$
    (Server $i$) Stores $h_i = H(C_i, (q_1, \ldots, q_t))$
return $C_1, \ldots, C_m$
Algorithm 3.3 Authenticate

Input: Security parameter $t$. Account $i \in [m]$. Challenges $C_1, \ldots, C_m$.
$(C_1^i, \ldots, C_t^i) \leftarrow C_i$    ▷ Display Single Digit Challenges
for $j = 1 \to t$ do
    (Semi-Trusted Computer) Displays $C_j^i$ to the user.
    (User) Computes $q_j \leftarrow f(\sigma(C_j^i))$.
(Semi-Trusted Computer) Sends $(q_1, \ldots, q_t)$ to the server for account $i$.
(Server) Verifies that $H(C_i, (q_1, \ldots, q_t)) = h_i$
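The flow of Algorithms 3.2 and 3.3 can be sketched in Python as follows. This is a minimal illustration under several assumptions that are not in the source: the user is simulated by the function `respond`, a clause is modeled as $k$ distinct variable indices, and PBKDF2-HMAC-SHA256 from the standard library stands in for the slow hash $H$ (the text suggests BCRYPT, which is not in the Python standard library). All function names are hypothetical.

```python
import hashlib
import hmac
import random


def respond(f, sigma, challenge):
    """User side: answer each single-digit challenge C_j with f(sigma(C_j))."""
    return tuple(f([sigma[v] for v in clause]) for clause in challenge)


def slow_hash(challenge, responses, salt, iterations=10_000):
    """Stand-in for H: PBKDF2-HMAC-SHA256 over the challenge and the responses."""
    message = repr((challenge, tuple(responses))).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", message, salt, iterations)


def create_challenges(f, sigma, m, t, k, rng, salt):
    """Sketch of Algorithm 3.2: one length-t challenge and stored hash per account."""
    n = len(sigma)
    challenges, stored = [], []
    for _ in range(m):
        challenge = tuple(tuple(rng.sample(range(n), k)) for _ in range(t))
        challenges.append(challenge)
        stored.append(slow_hash(challenge, respond(f, sigma, challenge), salt))
    return challenges, stored


def authenticate(challenge, responses, stored_hash, salt):
    """Sketch of Algorithm 3.3: the server recomputes H and compares it to h_i."""
    candidate = slow_hash(challenge, responses, salt)
    return hmac.compare_digest(candidate, stored_hash)
```

The server never stores $\sigma$ or the responses themselves, only the hash $h_i$, so a server breach exposes nothing beyond what an offline guessing attack against the slow hash can recover.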

3.6.1  Candidate  Scheme  1 

Our  first  candidate  human  computable  password  scheme  uses  the  function 

$$f_1(x_0, x_1, x_2, x_3, x_4, \ldots, x_{13}) = x_{13} + x_{12} + x_{(x_{11} + x_{10} \bmod 10)} \bmod 10.$$
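For concreteness, $f_1$ can be written directly in Python. The input is the list of 14 digits $x_0, \ldots, x_{13}$ (that is, the values $\sigma$ assigns to the challenge variables); the function name and the input check are illustrative choices.

```python
def f1(x):
    """Candidate function f1: x13 + x12 + x_{(x11 + x10 mod 10)} mod 10."""
    if len(x) != 14:
        raise ValueError("f1 expects exactly 14 digits x0, ..., x13")
    pointer = (x[11] + x[10]) % 10   # selects one of x0, ..., x9
    return (x[13] + x[12] + x[pointer]) % 10
```

For example, with $x_0, \ldots, x_9 = 0, \ldots, 9$ and $(x_{10}, x_{11}, x_{12}, x_{13}) = (3, 5, 4, 7)$ the pointer is $8$, so the output is $7 + 4 + 8 \bmod 10 = 9$.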

Claim 2 and Theorems 6 and 9 provide strong evidence that an adversary will need to see $\tilde{\Omega}(n^{1.5})$ example challenge-response pairs before he can recover the secret mapping $\sigma$ or begin to forge the user's passwords. A formal proof of Claim 2 can be found in the appendix. We first observe that to influence the value of $f_1(x_0, \ldots, x_{13})$ we must fix the values of $x_{12}$, $x_{13}$ and at least one $x_i$ for $i \in \mathbb{Z}_{10}$. Similarly, we must fix the values of $x_{10}$ and $x_{11}$ to make the resulting function linear. Therefore, $r(f_1) = 3$ and $g(f_1) = 2$.

Claim 2. $r(f_1) = 3$, $g(f_1) = 2$ and $s(f_1) = 3/2$.

The proof of Fact 2 can be found in the appendix.

Fact 2. $f_1$ is $(0.01, 0.045)$-hard to predict.

Remark 3. Claim 2 and Theorem 6 imply that any statistical adversary needs to see $\tilde{\Omega}(n^{3/2})$ example challenge-response pairs to recover $\sigma$ in the human computable password scheme given by $f_1$. Fact 2 and Corollary 1 demonstrate that a statistical adversary needs at least $\tilde{\Omega}(n^{3/2})$ example challenge-response pairs before it can guess the response to a random challenge with probability $> 0.145$. Theorem 9 provides strong evidence that $f_1$ is UF-RCA($n, m, t, \delta$)-secure for $m = \tilde{\Omega}(n^{3/2})$ and $\delta > 0.145^t$.
3.6.2  Candidate  Scheme  2


Our second candidate human computable password scheme uses the function

$$f_2(x_0, x_1, x_2, x_3, x_4, \ldots, x_{13}) = x_{13} + x_{12} + x_{11} + x_{(x_{10} \bmod 10)} \bmod 10.$$
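The second candidate function differs from the first only in how the pointer is formed and in adding one more term. A direct Python transcription, with an illustrative name and input check, is:

```python
def f2(x):
    """Candidate function f2: x13 + x12 + x11 + x_{(x10 mod 10)} mod 10."""
    if len(x) != 14:
        raise ValueError("f2 expects exactly 14 digits x0, ..., x13")
    pointer = x[10] % 10             # selects one of x0, ..., x9
    return (x[13] + x[12] + x[11] + x[pointer]) % 10
```

For example, with $x_0, \ldots, x_9 = 9, 8, \ldots, 0$ and $(x_{10}, x_{11}, x_{12}, x_{13}) = (2, 5, 4, 7)$ the pointer selects $x_2 = 7$, so the output is $7 + 4 + 5 + 7 \bmod 10 = 3$.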

Claim 3 and Theorem 6 provide strong evidence that an adversary will need to see $\tilde{\Omega}(n^{2})$ example challenge-response pairs before he can recover the secret mapping $\sigma$ or begin to forge the user's passwords. The proof of Claim 3 is very similar to the proof of Claim 2. To influence the value of $f_2(x_0, \ldots, x_{13})$ we must fix the values of $x_{11}$, $x_{12}$, $x_{13}$ and at least one $x_i$ for $i \in \mathbb{Z}_{10}$. Similarly, we must fix the value of $x_{10}$ to make the resulting function linear. Therefore, $g(f_2) = 1$ and $r(f_2) = 4$.

Claim 3. $r(f_2) = 4$, $g(f_2) = 1$ and $s(f_2) = 2$.

Fact 3. $f_2$ is $(0.01, 0.01)$-hard to predict.

Remark 4. Claim 3 and Theorem 6 imply that any statistical adversary needs to see $\tilde{\Omega}(n^2)$ example challenge-response pairs to recover $\sigma$ in the human computable password scheme given by $f_2$. Fact 3 and Corollary 1 demonstrate that a statistical adversary needs at least $\tilde{\Omega}(n^2)$ example challenge-response pairs before it can guess the response to a random challenge with probability $> 0.11$. Theorem 9 provides strong evidence that $f_2$ is UF-RCA($n, m, t, \delta$)-secure for $m = \tilde{\Omega}(n^2)$ and $\delta > 0.11^t$.
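The numerical thresholds quoted in Remarks 3 and 4 follow from the hard-to-predict parameters with $d = 10$. The calculation below also checks the stated values of $s(f)$ under the assumption that $s(f) = \min\{r(f)/2,\ 1 + g(f)\}$; this formula is not restated in this section and is used here only because it is consistent with Claims 2 and 3.

```latex
\begin{aligned}
f_1:&\quad \tfrac{1}{d} + \delta_2 = 0.1 + 0.045 = 0.145,
&\quad s(f_1) &= \min\{3/2,\ 1 + 2\} = 3/2,\\
f_2:&\quad \tfrac{1}{d} + \delta_2 = 0.1 + 0.01 = 0.11,
&\quad s(f_2) &= \min\{4/2,\ 1 + 1\} = 2.
\end{aligned}
```

For a length-$t$ password the forging bound is the per-digit threshold raised to the power $t$, for example $0.145^{10} \approx 4 \times 10^{-9}$ for a 10 digit password under scheme 1.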


3.6.3  Usability

We analyze usability along two dimensions: (1) the extra effort required for the user to memorize and rehearse the secret mapping $\sigma$, and (2) the time that it takes the user to compute his password when he wants to log in.


Memorizing and Rehearsing $\sigma$

We adopt the usability model from Chapter 2.4 to quantify the extra effort that a user would need to spend rehearsing the mapping $\sigma$ (the results are summarized in Table 3.1). We quantify usability by calculating $\mathbb{E}[XR_{365}]$, the expected number of extra rehearsals that the user will be required to do to remember the secret mapping $\sigma$ during the first year.
Review. Suppose that the user has $m$ accounts $A_1, \ldots, A_m$. Recall that a visitation schedule for an account $A_i$ is a sequence of real numbers $\tau_0^i < \tau_1^i < \ldots$, which represent the times when the account $A_i$ is visited by the user. Recall that a rehearsal requirement $[t_i^c, t_{i+1}^c)$ for a cue-association pair $(c, a)$ can be satisfied naturally if the user visits a site $A_j$ that uses the cue $c$ ($c \in c_j$) during the given time window. Here, $c_j$ denotes the set of cue-association pairs that the user must remember when logging into account $A_j$. In our case the user must remember the cue-association pairs $(i, \sigma(i))$ for each $i \in [n]$.

Example: Consider the human computable function $f_1$ from Section 3.6, and suppose that the user has to compute $f_1(\sigma(C_i))$ to authenticate at account $A_j$, where $C_i = (x_0, \ldots, x_{13})$. When the user computes $f_1$ he must rehearse the associations $(x_{10}, \sigma(x_{10}))$, $(x_{11}, \sigma(x_{11}))$, $(x_{12}, \sigma(x_{12}))$, $(x_{13}, \sigma(x_{13}))$ and $(x_i, \sigma(x_i))$ where $i = (\sigma(x_{10}) + \sigma(x_{11}) \bmod 10)$. Thus $c_j \supseteq \{x_i, x_{10}, x_{11}, x_{12}, x_{13}\}$. When the user authenticates he naturally rehearses each of these associations in $c_j$.
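The set of associations rehearsed by one evaluation of $f_1$ can be computed mechanically. In the Python sketch below a clause is a sequence of 14 variable names and `sigma` is a dictionary from variable names to digits; both representations and the name `rehearsed_cues` are assumptions made for illustration.

```python
def rehearsed_cues(sigma, clause):
    """Variables whose associations (x, sigma(x)) are rehearsed when computing f1.

    clause = (x0, ..., x13); the pointer i depends on the secret mapping, so the
    fifth rehearsed variable x_i differs from user to user.
    """
    x10, x11, x12, x13 = clause[10:14]
    i = (sigma[x10] + sigma[x11]) % 10
    return {clause[i], x10, x11, x12, x13}
```

Because the variables in a clause are distinct and $i \in \{0, \ldots, 9\}$, the returned set always has exactly five elements, which matches the five rehearsed locations per single-digit challenge.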


Evaluating Usability. Given a sufficient rehearsal schedule and a visitation schedule, Theorem 1 predicts the value of $XR_t$, the total number of extra rehearsals that a user will need to do to remember all of the cue-association pairs required to reconstruct all of his passwords for $t$ days. We use the formula from Theorem 1 to obtain the usability results in Table 2.3. To evaluate this formula we need to be given the rehearsal requirements, a visitation schedule ($\lambda_i$) for each account $A_i$ and a set of public challenges $C_i \in (X_{14})^{10}$ for each account $A_i$. The rehearsal requirements are given by the Expanding Rehearsal Assumption from Chapter 2.4 (we use the same association strength parameter a = 1), and the visitation schedules for each user are given in Table 2.1. We assume that each password is 10 digits long and that the challenges $C_i \in (X_{14})^{10}$ are chosen at random by Algorithm 2.2. Notice that each time the user responds to a single digit challenge he rehearses the secret mapping at five locations (see Section 3.6.3). Because the value of $\mathbb{E}[XR_{365}]$ depends on the particular password challenges that we generated for each account, we ran Algorithm 3.2 and computed the resulting value $\mathbb{E}[XR_{365}]$ one hundred times. The values in Table 2.3 represent the mean value of $\mathbb{E}[XR_{365}]$.


Discussion. One of the advantages of our human computable password schemes is that memorization is essentially a one-time cost for our Very Active, Typical and Occasional users. That is, once the user has memorized the mapping $\sigma : \{1, \ldots, n\} \to \mathbb{Z}_d$ he will get sufficient natural rehearsal to maintain this memory. With the exception of SC-0 (the least secure Shared Cues scheme), our schemes require the user to expend less extra effort rehearsing his secret mapping. Intuitively, this is because human computable password schemes give the user more opportunities to naturally rehearse $\sigma$ during the authentication process. To compute $f_1(\sigma(\{1, \ldots, 14\}))$ the user would need to recall the values of $\sigma(11)$, $\sigma(12)$, $\sigma(13)$, $\sigma(14)$ and $\sigma(1 + (\sigma(11) + \sigma(12) \bmod 10))$. If the user has 10 digit passwords then he will naturally rehearse the value of $\sigma$ at up to fifty different locations each time he computes one of his passwords. The disadvantage is that the user needs to spend extra time computing his password each time he authenticates.

              Our Scheme ($\sigma \in \mathbb{Z}_{10}^n$)      Shared Cues
User          n = 100    n = 50    n = 30      SC-0      SC-1      SC-2
Very Active   0.396      0.001     ≈ 0         ≈ 0       3.93      7.54
Typical       2.14       0.039     ≈ 0         ≈ 0       10.89     19.89
Occasional    2.50       0.053     ≈ 0         ≈ 0       22.07     34.23
Infrequent    70.7       22.3      6.1         ≈ 2.44    119.77    173.92

Table 3.1: $\mathbb{E}[XR_{365}]$: Extra rehearsals over the first year to remember $\sigma$ in our scheme, compared with Shared Cues schemes SC-0, SC-1 and SC-2 [33].

A    B    C    D
0    E    5    J
1    F    6    K
2    G    7    L
3    H    8    M
4    I    9    N

Table 3.2: Single-Digit Challenge Layout in Scheme 1


Computation  Time 

To help the user compute the response to a single digit challenge $C$ more quickly a semi-trusted computer could display the challenge in a more helpful manner. For example, the challenge $C = (E, F, G, H, I, J, K, L, M, N, A, B, C, D) \in X_{14}$ might be displayed to the user as in Table 3.2. Now to compute $f(\sigma(C))$ in scheme 1 the user would execute the following steps: (1) Recall $\sigma(A)$, the number associated with the letter A, (2) Recall $\sigma(B)$, (3) Compute $i = \sigma(A) + \sigma(B) \bmod 10$ (without loss of generality suppose that $i = 8$), (4) Find the letter at index $i$ (M if $i = 8$), (5) Recall $\sigma(M)$, (6) Recall $\sigma(C)$, (8) Compute $j = \sigma(M) + \sigma(C) \bmod 10$, (9) Recall $\sigma(D)$, (10) Return $j + \sigma(D) \bmod 10$.
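The mental steps above can be traced in code. The following Python sketch follows the same order of operations for scheme 1 on a 14-symbol challenge; `sigma` is a dictionary from symbols to digits, and the function name `respond_scheme1` is a hypothetical choice.

```python
def respond_scheme1(sigma, challenge):
    """Follow the user's mental steps for scheme 1 on a 14-symbol challenge.

    challenge[0..9] are the indexed symbols and challenge[10..13] are the four
    symbols shown in the top row of the Table 3.2 layout.
    """
    a, b, c, d = challenge[10:14]
    i = (sigma[a] + sigma[b]) % 10        # steps (1)-(3): recall two digits, add
    pointed = challenge[i]                # step (4): symbol displayed at index i
    j = (sigma[pointed] + sigma[c]) % 10  # steps (5)-(8): recall two digits, add
    return (j + sigma[d]) % 10            # steps (9)-(10): recall and add the last digit
```

With the challenge from the text and a mapping in which $\sigma(A) = 3$, $\sigma(B) = 5$, $\sigma(M) = 6$, $\sigma(C) = 4$ and $\sigma(D) = 7$, the trace gives $i = 8$, the pointed symbol M, $j = 0$ and the response $7$. Each intermediate value depends only on the two most recent ones, which is why the working-memory load stays small.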

Notice that the computation at each step only relies on values from the last two steps, so we do not require the user to keep more than 7 chunks of information in active memory [108]. Thus, in scheme 1 the user can compute his response to a single-digit challenge in 10 mental steps, and it would take $10t$ steps to respond to a length-$t$ password challenge.

We timed ourselves to determine how long each scheme took one of the authors to evaluate. After the first author had memorized the secret mapping it took him $t = 7.5$ seconds on average to compute the response $f(\sigma(C))$ to a random challenge $C \in X_k$ in both schemes. Thus, our schemes are 7.5-human computable for at least some human users, so it would take 75 seconds to compute a 10 digit password using this scheme.⁴

3.6.4  Statistical  Algorithms:  Security  Upper  Bound 

Theorem  10  demonstrates  that  our  lower  bound  for  statistical  algorithms  are 
asymptotically  tight  for  both  of  our  human  computable  password  schemes.  In 
particular,  we  demonstrate  that  m  =  O  (ftrb)/2j  queries  to  1-MSTAT  are  sufficient 
for  a  statistical  algorithm  to  recover  a. 

Theorem  10.  For  fi  e  {/i,  f2j  there  is  a  randomized  algorithm  that  makes  O  (Vzmax{|/'W/2}  iQg2  n 
calls  to  the  1-MSTAT  (Vrdi)/21  j  oracle  and  returns  a  with  probability  1  -  o(l). 

For binary functions $f : \{0,1\}^k \to \{0,1\}$, Feldman et al. [73] gave a
randomized statistical algorithm to find $\sigma \in \{0,1\}^n$ using just
$O(n^{r(f)/2} \log^2 n)$ calls to the 1-MSTAT$(n^{\lceil r(f)/2 \rceil})$ oracle.
Their main technique is a discrete spectral iteration procedure to find the
eigenvector (singular vector) with the largest eigenvalue (singular value) of a
matrix $M$ sampled from a distribution $M_{\sigma,p}$ over
$|X_{\lfloor r(f)/2 \rfloor}| \times |X_{\lceil r(f)/2 \rceil}|$ matrices. With
probability $1 - o(1)$ this eigenvector will encode the value $\sigma(C)$ for
each clause $C \in X_{\lfloor r(f)/2 \rfloor}$. We show that the discrete spectral
iteration algorithm of Feldman et al. [73] can be extended to recover
$\sigma \in \mathbb{Z}_{10}^n$ when $f \in \{f_1, f_2\}$ is one of our candidate
human computable functions. See Appendix 8.4.1 for more details.

$^4$Admittedly, we may not be a representative sample for an average human user, but we would
argue that this is at least a reasonable approximation of the average member of the computer
science community.
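As a rough illustration of the spectral step only, the sketch below runs plain power iteration on a small matrix and rounds the result to signs. It is not the discrete spectral iteration algorithm of Feldman et al. [73]: the planted rank-one sign matrix, the function names, the starting vector and the iteration count are all assumptions made for this example.

```python
import math


def top_right_singular_vector(M, iterations=50):
    """Approximate the top right singular vector of M by power iteration on M^T M."""
    rows, cols = len(M), len(M[0])
    # Fixed, slightly uneven starting vector so the example is deterministic.
    v = [1.0 + 0.1 * j for j in range(cols)]
    for _ in range(iterations):
        u = [sum(M[i][j] * v[j] for j in range(cols)) for i in range(rows)]  # u = M v
        w = [sum(M[i][j] * u[i] for i in range(rows)) for j in range(cols)]  # w = M^T u
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def round_to_signs(v):
    """Round each coordinate to +1 or -1."""
    return [1 if x >= 0 else -1 for x in v]


# Hypothetical planted instance: M = a b^T for hidden sign vectors a and b.
a = [1, -1, 1]
b = [1, 1, -1, -1]
M = [[ai * bj for bj in b] for ai in a]
recovered = round_to_signs(top_right_singular_vector(M))
```

A singular vector is only determined up to a global sign, so `recovered` should be compared with both `b` and its negation.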


Discussion. We note that Theorem 10 cannot be extended to arbitrary functions
$f : \mathbb{Z}_d^k \to \mathbb{Z}_d$. Consider for example the unique function
$f : \mathbb{Z}_{10}^6 \to \mathbb{Z}_{10}$ s.t.
$f(x_1, \ldots, x_6) \equiv f'(x_1 \bmod 2, \ldots, x_6 \bmod 2) \pmod 2$ and
$f(x_1, \ldots, x_6) \equiv f''(x_1 \bmod 5, \ldots, x_6 \bmod 5) \pmod 5$, where
$f' : \mathbb{Z}_2^6 \to \mathbb{Z}_2$ and $f'' : \mathbb{Z}_5^6 \to \mathbb{Z}_5$.
By the Chinese Remainder Theorem, instead of picking a secret mapping
$\sigma \in \mathbb{Z}_{10}^n$ we could equivalently pick the unique secret mappings
$\sigma_1 \in \mathbb{Z}_2^n$ and $\sigma_2 \in \mathbb{Z}_5^n$ s.t.
$\sigma \equiv \sigma_1 \pmod 2$ and $\sigma \equiv \sigma_2 \pmod 5$. Now drawing
challenge-response pairs from the distribution $Q^{f}_{\sigma}$ is equivalent to
drawing challenge-response pairs from the distributions $Q^{f'}_{\sigma_1}$ and
$Q^{f''}_{\sigma_2}$. Suppose that
$f'(x_1, \ldots, x_6) = x_1 x_2 + x_3 + x_4 + x_5 + x_6 \bmod 2$, and
$f''(x_1, \ldots, x_6) = x_1$. Then we have
$r(f) = \min(r(f'), r(f'')) = r(f'') = 1$, but $r(f') = 4$. We can find $\sigma_2$
using $O(n \log^2 n)$ calls to 1-MSTAT$(n)$, but to find $\sigma$ we must first
recover $\sigma_1$, which requires $\Omega(n^{r(f')/2}) = \Omega(n^2)$ calls to
1-MSTAT$(n^2)$.
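The Chinese Remainder Theorem decomposition in this counterexample can be checked numerically. The following sketch assumes the component functions are exactly $f'(x_1, \ldots, x_6) = x_1 x_2 + x_3 + x_4 + x_5 + x_6 \bmod 2$ and $f''(x_1, \ldots, x_6) = x_1$; the helper names are ours and the code uses 0-based indices, so `x[0]` plays the role of $x_1$.

```python
def f_prime(bits):
    """f' : Z_2^6 -> Z_2, assumed to be x1*x2 + x3 + x4 + x5 + x6 mod 2."""
    return (bits[0] * bits[1] + bits[2] + bits[3] + bits[4] + bits[5]) % 2


def f_double_prime(residues):
    """f'' : Z_5^6 -> Z_5, assumed to be the projection onto the first input."""
    return residues[0] % 5


def crt_combine(a2, a5):
    """Return the unique v in Z_10 with v = a2 (mod 2) and v = a5 (mod 5)."""
    for v in range(10):
        if v % 2 == a2 and v % 5 == a5:
            return v
    raise ValueError("residues out of range")


def f(x):
    """The unique f : Z_10^6 -> Z_10 agreeing with f' mod 2 and f'' mod 5."""
    return crt_combine(f_prime([xi % 2 for xi in x]),
                       f_double_prime([xi % 5 for xi in x]))


# A secret mapping over Z_10 splits into a mod-2 part and a mod-5 part.
sigma = [3, 8, 0, 5]
sigma_1 = [s % 2 for s in sigma]
sigma_2 = [s % 5 for s in sigma]
rebuilt = [crt_combine(a, b) for a, b in zip(sigma_1, sigma_2)]
```

Recombining `sigma_1` and `sigma_2` returns the original `sigma`, which is why an adversary must recover both parts, including the harder mod-2 part, to learn the full mapping.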


3.7  Discussion 

3.7.1  Human  Computable  Passwords  Challenge 

Our security lower bounds are asymptotic (e.g., an adversary needs to see
$m = \tilde{\Omega}(n^{s(f)})$ challenge-response pairs to forge passwords), but in
our context the constants are very important. To better understand the exact
security bounds in our scheme we created several public challenges to break our
candidate human computable password schemes under different parameters (see
Table 8.1). The challenges can be found at
http://www.cs.cmu.edu/~jblocki/HumanComputablePasswordsChallenge/challenge.htm.
For each challenge we selected a random secret mapping
$\sigma \in \mathbb{Z}_{10}^n$, and published (1) $m$ single-digit
challenge-response pairs $(C_1, f(\sigma(C_1))), \ldots, (C_m, f(\sigma(C_m)))$,
where each clause $C_i$ is chosen uniformly at random from $X_k$, and (2) 20
length-10 password challenges $\vec{C}_1, \ldots, \vec{C}_{20} \in (X_k)^{10}$. The
goal of each challenge is to correctly guess one of the secret passwords
$p_i = f(\sigma(\vec{C}_i))$ for some $i \in [20]$. More details can be found in
Appendix 8.1.
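A challenge instance of this kind could be generated along the following lines. This is a minimal sketch under stated assumptions, not the code used for the public challenges: we assume a clause is an ordered tuple of $k$ distinct variable indices, `toy_f` is a placeholder that is neither $f_1$ nor $f_2$, and the parameter values are arbitrary. A real deployment would also draw the secret from a cryptographic source such as Python's `secrets` module rather than a seeded generator.

```python
import random


def sample_clause(n, k, rng):
    """Sample an ordered clause of k distinct variable indices from {0, ..., n-1}."""
    return tuple(rng.sample(range(n), k))


def generate_challenge(n, k, m, f, rng):
    """Pick a random secret mapping and publish m single-digit challenge-response pairs."""
    sigma = [rng.randrange(10) for _ in range(n)]  # secret mapping in Z_10^n
    pairs = []
    for _ in range(m):
        clause = sample_clause(n, k, rng)
        response = f([sigma[i] for i in clause])   # f(sigma(C))
        pairs.append((clause, response))
    return sigma, pairs


def toy_f(digits):
    """Placeholder response function (NOT f_1 or f_2): digit sum mod 10."""
    return sum(digits) % 10


rng = random.Random(0)  # seeded only to make the illustration repeatable
sigma, pairs = generate_challenge(n=100, k=14, m=5, f=toy_f, rng=rng)
```

Only `pairs` would be published; `sigma` stays secret and is what an attacker tries to recover.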
3.7.2 Security Under Continuous Leakage


Consider the following scenario: the adversary infects the user's computer with
malware which is never detected. Every time the user computes a password in
response to a challenge the adversary observes the password in plaintext. One
way to protect the user in this extreme scenario would be to generate multiple (e.g.,
$10^6$) one-time passwords for each of the user's accounts. While usability concerns
make this approach infeasible in a traditional password scheme (it would be far
too difficult for the user to memorize a million one-time passwords for each of
his accounts), it may be feasible to do this using a human computable password
scheme. When we initially generate the secret mapping
$\sigma \sim \mathbb{Z}_{10}^n$ we could also generate cryptographic hashes for
multiple one-time passwords $H(C, f_3(\sigma(C)))$.

We conjecture that the following candidate human computable password scheme
$f_3$ could be used to provide security even in this extreme scenario:


$$f_3(x_0, x_1, \ldots, x_{31}) = \left( \sum_{i=21}^{31} x_i + x_{\left( \sum_{i=10}^{20} x_i \bmod 10 \right)} \right) \bmod 10.$$


The drawback is that $f_3$ will take longer for a user to execute in his head. It
requires the user to perform 23 additions modulo 10 compared with three in the
previous schemes $f_1$ and $f_2$. The advantage is that the security parameters are
quite strong (e.g., $g(f_3) = 11$, $r(f_3) = 12$ and $s(f_3) = 6$), which implies that a
polynomial time adversary needs $m = \tilde{\Omega}(n^6)$ challenge-response pairs to
recover the secret mapping. If $n = 100$ then the adversary would need around $10^{12}$
challenge-response pairs before he could break UF-RCA security. Even if the
adversary runs in time proportional to $n$ and uses the attack from Remark 2 he
would still need $\Omega(n^{1+5.5})$ examples. If we make the reasonable assumption
that a single user has at most $10^5$ accounts and never authenticates to any single
account more than $10^6$ times over the course of his life then the adversary will
never see enough examples to recover $\sigma$.
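To make the shape of $f_3$ concrete, the sketch below evaluates it on 32 digits. It assumes our reading of the definition, namely that the digits $x_{10}, \ldots, x_{20}$ are summed modulo 10 to select one of $x_0, \ldots, x_9$, which is then added to $x_{21} + \cdots + x_{31}$ modulo 10; the function name and the sample input are hypothetical.

```python
def f3(x):
    """Evaluate the candidate f_3 on digits x[0..31] (assumed form, see lead-in)."""
    if len(x) != 32 or any(not 0 <= d <= 9 for d in x):
        raise ValueError("expected 32 digits in the range 0..9")
    index = sum(x[10:21]) % 10          # x_10..x_20 select one of x_0..x_9
    return (sum(x[21:32]) + x[index]) % 10


# Hypothetical input: x_0..x_9 = 0..9, x_10..x_20 all 1, x_21..x_31 all 2.
example = list(range(10)) + [1] * 11 + [2] * 11
digit = f3(example)

# With n = 100 variables, the n^6 bound quoted in the text is 10^12 pairs.
pairs_needed = 100 ** 6
```

Here the selector sum is 11, so the index is 1, and the response digit is (22 + 1) mod 10 = 3.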


3.7.3  Open  Questions 

Can we precisely characterize the functions $f : \mathbb{Z}_d^k \to \mathbb{Z}_d$
for which we can efficiently recover $\sigma$ after seeing
$\tilde{O}(n^{r(f)/2})$ challenge-response pairs? Feldman et al. [73] gave a
statistical algorithm that recovers the secret mapping whenever $d = 2$ after
making $\tilde{O}(n^{r(f)/2})$ queries to 1-MSTAT$(n^{\lceil r(f)/2 \rceil})$. While
we show that the same algorithm can be used to recover $\sigma$ after making
$\tilde{O}(n^{r(f)/2})$ queries to 1-MSTAT$(n^{\lceil r(f)/2 \rceil})$ in our
candidate human computable password schemes with $d = 10$, we also showed that
these results do not extend to all functions $f : \mathbb{Z}_d^k \to \mathbb{Z}_d$.


Improving Usability. Is it possible to improve usability by designing a human
computable function $f : \mathbb{Z}_{10}^k \to \mathbb{Z}_{100}$? This could
potentially allow the user to generate a secure length-$t$ password after
responding to only $t/2$ challenges. Our statistical dimension lower bounds also
hold for functions $f : \mathbb{Z}_d^k \to \mathbb{Z}_{d^2}$. As before the
challenge would be designing a function that is human computable and has strong
security properties (e.g., $s(f)$ is large and $f$ is (S1, S2)-hard to guess).

Chapter 4

Empirical  Validation  of  User  Model 


4.1  Introduction 

In  this  chapter  we  discuss  our  ongoing  user  study  to  quantify  the  effects  of  rehearsal 
and  the  use  of  mnemonic  techniques  on  long  term  memory  retention.  We  are 
conducting  this  study  online  using  Amazon's  Mechanical  Turk  framework.  Our 
goal  is  to  empirically  evaluate  the  usability  model  presented  in  Chapter  2  as  well 
as  the  Shared  Cues  password  management  scheme. 

Specific Rehearsal Schedules: Recall that the usability model of Chapter 2 was
based on a sufficient rehearsal assumption. The expanding rehearsal assumption
from Chapter 2 said that a user can maintain a memory by rehearsing once during
each of the time intervals $[2^{i\sigma}, 2^{(i+1)\sigma})$, where
$i \in \mathbb{N}$ is the number of previous rehearsals and $\sigma$ is a constant
that measures association strength. This assumption implies that after the $i$'th
rehearsal a user will be able to recall a memory for $2^{i\sigma}$ more days
without any additional rehearsals. Wozniak and Gorzelanczyk [160] conducted an
empirical study of undergraduate students who were learning vocabulary words
for a foreign language. Their results indicated that $\sigma_w$ varied slightly with
each vocabulary word $w$ ($\sigma_w$ was smaller for difficult vocabulary words).
This raises an important question: what specific rehearsal schedules work in our
password context?
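The intervals in the expanding rehearsal assumption are easy to tabulate. The sketch below assumes the intervals are $[2^{i\sigma}, 2^{(i+1)\sigma})$ measured in days, with $i$ counting previous rehearsals from 0; the function name is ours.

```python
def expanding_rehearsal_intervals(sigma, num_rehearsals):
    """Return the interval (start_day, end_day) for each rehearsal i = 0, 1, ...

    Rehearsal i must fall in [2^(i*sigma), 2^((i+1)*sigma)) under the assumption.
    """
    return [(2 ** (i * sigma), 2 ** ((i + 1) * sigma))
            for i in range(num_rehearsals)]


# With association strength sigma = 1 the intervals start on days 1, 2, 4, ...
intervals = expanding_rehearsal_intervals(sigma=1, num_rehearsals=7)
start_days = [start for start, _ in intervals]
```

A larger association strength stretches every interval, so fewer rehearsals are needed to cover the same time span.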

Advantages of Mnemonic Techniques: In Chapter 2 we simply used $\sigma = 1$ to
measure the usability of a password management scheme, whether or not the
password management scheme used mnemonic techniques to help users remember
their passwords. This raises an important question. Does the use of mnemonic
techniques (e.g., method of loci, person-action-object stories) allow us to safely
adopt a rehearsal schedule with longer intervals in between rehearsals? More
formally, is $\sigma_{\text{mnemonic}} > \sigma_{\text{non-mnemonic}}$?


Interference: Suppose that our user is able to remember a person-action-object
story by following the rehearsal schedule given by the expanding rehearsal
assumption with association strength $\sigma$. Can our user memorize $n$
person-action-object stories by following the same rehearsal schedule or does the
user need to follow a more conservative schedule (smaller value of $\sigma$) when he
is memorizing multiple person-action-object stories?


Study  Overview  Each  participant  in  the  study  was  asked  to  memorize  several 
randomly  selected  actions  (e.g.,  'swallowing,'  'kicking')  and  several  randomly 
selected  objects  (e.g.,  'bike,'  'car').  Participants  assigned  to  the  mnemonic  group 
were  given  specific  instructions  about  how  to  memorize  the  actions  following  the 
Shared  Cues  password  management  scheme  in  Chapter  2.6.  To  help  participants  in 
the  mnemonic  group  memorize  one  of  their  action(s)  and  object(s)  each  participant 
was  shown  two  additional  photos  of  a  person  and  a  scene  and  was  asked  to  imagine 
the  corresponding  person-action-object  story  taking  place  inside  the  scene  (e.g., 
the user might be shown photos of Bill Gates and a beach and asked to imagine
"Bill  Gates  swallowing  a  bike  on  the  beach.").  Other  participants  were  assigned 
to  the  standard  group  and  were  simply  instructed  to  memorize  their  actions  and 
objects  (e.g.,  by  typing  in  their  words  several  times).  Participants  were  paid 
$0.50  for  completing  the  memorization  phase.  After  participants  memorized  their 
words  we  periodically  asked  them  to  return  to  rehearse  their  words.  During 
each  rehearsal  participants  in  the  mnemonic  group  were  shown  the  photos  of  the 
person  and  the  scene  as  a  cue  to  help  them  remember  the  associated  action  and 
object.  Participants  in  the  standard  group  were  simply  asked  to  recall  their  actions 
and  objects.  Each  participant  was  assigned  a  specific  rehearsal  schedule  (e.g., 
participants  in  the  aggressive  rehearsal  group  were  reminded  to  rehearse  on  the 
following  days:  1, 2, 4, 8, 16, 32, 64).  During  each  rehearsal  participants  were  given 
three chances to remember all of their actions and objects (i.e., their password).
To encourage participants to return we paid $0.75 for each complete rehearsal. To
incentivize participants to remember their words we required participants who
forgot their words to re-complete the memorization phase before they were paid $0.75.
Participants who did not remember their words during a rehearsal were not asked
to return for future rehearsals.
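As a worked example of the payment structure, the sketch below totals what a participant would earn after a given number of completed rehearsals. It assumes the $0.50 memorization payment and $0.75 per completed rehearsal stated above, and it uses the aggressive schedule's reminder days only as an illustration; the names are ours.

```python
MEMORIZATION_CENTS = 50   # paid once for completing the memorization phase
REHEARSAL_CENTS = 75      # paid for each completed rehearsal


def total_payment_cents(completed_rehearsals):
    """Total payment in cents after the given number of completed rehearsals."""
    return MEMORIZATION_CENTS + REHEARSAL_CENTS * completed_rehearsals


aggressive_days = [1, 2, 4, 8, 16, 32, 64]
total = total_payment_cents(len(aggressive_days))
print(f"${total / 100:.2f}")
```

A participant who completes all seven rehearsals on this schedule would earn $5.75 in total.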


Preliminary Results. While the user study is still ongoing, we present the
results from completed rehearsals in Section 4.4. Our results provide strong
empirical evidence that users can remember person-action-object stories by following
a rehearsal schedule that satisfies the expanding rehearsal assumption.

Specific Rehearsal Schedules: Our results demonstrate the benefit of having several
early rehearsals. Participants who followed the heavierstart rehearsal schedule (a
schedule with several rehearsals on day one) have been very successful at remembering
their action-object pairs during rehearsals. Participants following the aggressive
rehearsal schedule (the same schedule as heavierstart, but without the extra
rehearsal on day 1) struggled to remember all of their action-object pairs during
the first rehearsal on day one (25% of participants forgot at least one of their
stories), but participants who survived this first rehearsal had much higher success
rates during all of the ensuing rehearsals.

Mnemonic Advantage: Our results strongly support the hypothesis that recall is
significantly improved by asking users to follow specific mnemonic techniques to
memorize their actions and objects. Participants in the mnemonic group consistently
outperformed participants in the standard text group during each rehearsal.

Interference: While participants in other groups did well, no other group did as
well as participants who only had to memorize one or two action-object pairs, not
even participants in groups with more frequent rehearsals. Participants who were
asked to memorize only one or two action-object pairs at a time have perfectly
remembered their stories during each rehearsal phase.


Organization.  In  Section  4.2  we  discuss  related  work.  We  then  overview  the 
design  of  our  user  study  in  Section  4.3.  While  the  study  is  still  ongoing  we  do  have 
some  preliminary  results  from  the  study.  We  present  these  results  in  Section  4.4. 


4.2  Related  Work 

Pimsleur [120] proposed a rehearsal schedule to help people memorize unfamiliar
vocabulary words. His proposed schedule is precisely the schedule given
by the expanding rehearsal assumption with the association strength constant set to
$\sigma = \log_2 5 \approx 2.3$ and the initial delay before the first rehearsal set
to 5 seconds (e.g., he suggested rehearsing after 5 seconds, 25 seconds, 2 minutes,
10 minutes, 5 hours, 1 day, 5 days, 20 days). Pimsleur based his recommendations on
previous empirical studies [159, pp. 726 ff]. The application SuperMemo [161] uses a
similar rehearsal schedule to help users remember flashcards. Wozniak and Gorzelanczyk
conducted an empirical study to test these rehearsal schedules [160]. In their study
undergraduate students were asked to memorize and rehearse vocabulary words
for a foreign language by following a rehearsal schedule very similar to the
expanding rehearsal schedule.$^1$ While these prior studies provide strong empirical
evidence for the expanding rehearsal assumption from Chapter 2 we stress that
there are two key differences in our study. First, because we are asking the user
to memorize secrets that will be used to form passwords, our rehearsal schedule
needs to be conservative enough that our user will consistently be able to remember
his secrets during each rehearsal. In other studies the information participants
were asked to memorize (e.g., vocabulary words for a foreign language) was not
secret, so if the participant forgot this information during a rehearsal they could
simply look up the correct answer. However, in the password setting the secrets
that the user memorizes should not be written down because they are sensitive, so
we will not always be able to refresh the user's memory if he forgets his secret.
Second, in our password management scheme we are asking users to memorize
secret person-action-object stories by following particular mnemonic techniques.
Because these stories may be easier or harder to memorize than other information,
the ideal rehearsal schedule should be tailored to particular mnemonic techniques.
Previous studies have demonstrated that cued recall is easier than pure recall (see
for example [23]) and that we have a large capacity for visual memories [145].
However, we are not aware of any prior studies which compare cued recall and
pure recall when participants are following a rehearsal schedule similar to the one
suggested by the expanding rehearsal assumption.

Bonneau and Schechter conducted a user study in which participants were
encouraged to slowly memorize a stronger password using spaced repetition [42].
Each time a participant returned to complete a distractor task he was asked to log in
by entering his password. During the first login the participant was shown four
additional random characters and asked to type them in after his password. To
encourage participants to memorize these four characters they would intentionally
wait a few seconds before displaying them to the user the next time he was asked
to log in to complete a distractor task. Once a participant was able to log in several
times in a row (without waiting for the characters to be displayed) they would
encourage that participant to memorize four additional random characters in
the same way. They found that 88% of participants were able to recall their
entire password without any prompting three days after the study was completed.
There are several key differences between their study and ours. First, in our study
participants were asked to memorize their entire password at the start of the study.
By contrast, Bonneau and Schechter encouraged participants to slowly memorize
their passwords. Second, Bonneau and Schechter did not tell participants that their
goal was to slowly memorize a strong 56-bit password. By contrast, in our study we
explicitly told participants that their goal was to remember their words (without
writing them down). Finally, participants in our study were given fewer chances
to rehearse their passwords and were asked to remember their passwords over
a longer duration of time (3 months vs 2 weeks). Bonneau and Schechter asked
participants to log in 90 times over a two week period. In our study participants
were asked to rehearse at most 11 times over a period of up to 85 days.

$^1$Wozniak and Gorzelanczyk tracked each student's performance with each particular
vocabulary word and used that information to estimate how difficult each word was. If a word was
deemed 'difficult' then the length of the time interval before the next rehearsal would only increase
by a small multiplicative constant (e.g., 1.5) and if the word was judged to be 'easy' then this time
interval would increase by a larger multiplicative constant (e.g., 4).


4.3  Study  Design 

Our user study is being conducted online using Amazon's Mechanical Turk framework.
It was approved by the Institutional Review Board (IRB) at Carnegie Mellon
University under IRB protocol FIS14-294: Sufficient Rehearsal Schedules and
Mnemonic Techniques. After participants consented to participate in the research
study we randomly assigned each participant to a particular study condition.
Members in a particular condition were asked to memorize a particular number of
action-object pairs (either 1, 2 or 4) by using a particular memorization technique
(e.g., mnemonic or standard) and following a particular rehearsal schedule (e.g.,
aggressive, conservative, heavystart). After we assigned participants to a study
condition we asked each participant to complete the memorization phase. During
the memorization phase each participant was given several randomly generated
actions (e.g., swallowing) and several randomly generated objects (e.g., bike), and
asked to memorize each word. Participants in mnemonic conditions were given
specific instructions about how to memorize their words. We paid participants
$0.50 for completing the memorization phase. After each participant completed
the memorization phase we asked them to return periodically to rehearse their
words. To encourage participants to return we paid participants $0.75 for each
rehearsal, whether or not they were able to remember the words. If a participant
forgot the action and the object then we reminded the participant of the actions and
objects that he had memorized and asked that user to complete the memorization
phase again.

Below  we  provide  examples  of  the  instructions  given  to  each  participant  during 
the  memorization  and  rehearsal  phases. 

4.3.1  Recruitment  Text 

On  the  Mechanical  Turk  website,  participants  were  recruited  with  the  following 
text: 


Participate in a Carnegie Mellon University research study on memory.
You will be asked to memorize and rehearse random words for a
50  cent  payment.  After  you  complete  the  memorization  phase,  we  will 
periodically  ask  you  to  return  to  check  if  you  still  remember  the  words. 
If  you  forget  the  words  then  we  will  remind  you  of  the  words  and  ask 
you  to  complete  the  memorization  phase  again.  You  will  be  paid  75 
cents  upon  the  completion  of  each  rehearsal. 

Because  this  is  a  memory  study  we  ask  that  you  do  not  write  down 
the  words  that  we  ask  you  to  memorize.  You  will  be  paid  for  each 
completed  rehearsal  phase  —  even  if  you  forgot  the  words. 


After  each  participant  consented  to  participate  in  the  research  study  they  were 
assigned  to  the  mnemonic  group  or  to  the  standard  group.  We  then  asked  each 
participant  to  complete  the  memorization  phase  of  the  study. 

4.3.2  Memorization  Phase 

Mnemonic  Group 

We  first  describe  the  memorization  phase  for  participants  assigned  to  the  mnemonic 
group. Participants in the mnemonic group were first given the following
instructions.

Instructions


This study is being conducted as part of a Carnegie Mellon University
research project. It is important that you answer questions honestly
and completely. Please take a minute to read the following instructions.

The  goal  of  this  study  is  to  quantify  the  effects  of  rehearsal  and 
the  use  of  mnemonic  techniques  on  long  term  memory  retention.  In 
this  study  you  will  be  asked  to  memorize  and  rehearse  eight  random 
words  (four  actions  and  four  objects).  During  the  first  phase  we  will 
ask  you  to  memorize  the  eight  random  words  -  you  will  be  paid  $0.50 
upon  completion  of  the  memorization  phase.  After  you  complete  the 
memorization  phase  we  will  periodically  ask  you  to  return  via  email  to 
check  if  you  still  remember  the  words.  If  you  forget  the  words,  we  will 
remind  you  of  the  words  and  ask  you  to  complete  the  memorization 
phase  again.  You  will  be  paid  $0.75  upon  the  completion  of  each 
rehearsal. 

Important:  Because  this  is  a  memory  study  we  ask  that  you  do  not 
write  down  the  words  we  ask  you  to  memorize.  You  will  be  paid  for 
each  completed  rehearsal  phase  -  even  if  you  forgot  the  words.  You 
have  been  assigned  to  the  mnemonic  group,  which  means  that  we  give 
you  specific  instructions  about  how  to  memorize  the  words.  One  of  the 
purposes  of  this  study  is  to  determine  how  effective  certain  mnemonic 
techniques  are  during  the  memorization  task.  We  ask  that  you  follow 
the  directions  exactly  -  even  if  you  would  prefer  to  memorize  the  words 
in  a  different  way. 

After  participants  finished  reading  the  instructions  the  memorization  phase 
proceeded  as  follows: 


Memorization Steps. Step 0) Initially, participants were shown a photo of a
scene (e.g., Figure 4.1a). Participants were then asked to select a famous person or
character (e.g., Darth Vader) and were shown a photograph of the famous person
that they selected (see Figure 4.1b). We then generated a random action (e.g.,
bribing) and a random object (e.g., roach). See Section 4.3.6 for the lists of people,
actions and objects used in the study.

Figures 4.1a and 4.1b illustrate Step 0.

[Figure 4.1: Memorization Step 0. Scene and Person. (a) Scene: Lily Pads on the Amazon River. (b) Person: Darth Vader.]

After we generated the random action and the random object we asked the participant
to memorize their action and their object by completing Steps 1-3. Figure 4.2a
illustrates these steps. Step 1) We asked participants to imagine the person they
selected performing the action in the given scene (e.g., imagine Darth Vader bribing
the roach on the lily pad). Step 2) We asked each participant to make up a story
involving their person, action and object and enter it (e.g., "Darth Vader is bribing
a roach").$^2$ Step 3) We asked each participant to select a photograph of the action
and a photograph of the object, and to type in the action and the object two
more times. Step 4) We asked most participants to repeat Steps 0 through 3 four
times using a new scene (e.g., a baseball field or a hotel room underneath the sea),
a new famous person/character and a new, randomly selected, action-object
pair during each repetition. Thus, most participants memorized a total of eight
words (four actions and four objects). Step 5) Finally, we asked each participant
to complete a rehearsal phase (see Figure 4.2b).


Standard  Group 

We next describe the memorization phase for participants assigned to the standard
group. Participants in the standard group were first given the following
instructions.

$^2$We required participants to type in a story that contained all of their words in the correct order
(Person-Action-Object).
[Figure 4.2a: screenshot of the memorization interface. On-screen text: "Your words are: bribing roach. Imagine the person you have selected performing this action in the scene above. Type in a short story involving the person, action, and object. Make sure your words appear in your story, in the correct order. Select representative images for the actions and objects above by clicking on the placeholder images beneath the words." Fields: Story; "Type your words twice in the boxes below" (Action, Object); Continue button.]

(a) Memorization Steps 1-3. Darth Vader bribing a roach on the lily pad.

[Figure 4.2b: screenshot of the rehearsal interface. On-screen text: "Darth Vader"; two "Select an Option" menus; listed options include bowing, bribing, burying.]

(b) Rehearsal Phase. Darth Vader and the photo of the lily pads on the Amazon River
are a cue to aid memory recall.
Instructions


This study is being conducted as part of a Carnegie Mellon University
research project. It is important that you answer questions honestly
and completely. Please take a minute to read the following instructions.

The goal of this study is to quantify the effects of rehearsal and
the use of mnemonic techniques on long term memory retention. In
this study you will be asked to memorize and rehearse eight random
words (four actions and four objects). During the first phase we will
ask you to memorize the eight random words - you will be paid $0.50
upon completion of the memorization phase. After you complete the
memorization phase we will periodically ask you to return via email
to check if you still remember the words. If you forget the words, we
will remind you of the words and ask you to complete the memorization
phase again. You will be paid $0.75 upon the completion of each
rehearsal.

Important: Because this is a memory study we ask that you do not
write down the words we ask you to memorize. You will be paid for
each completed rehearsal phase - even if you forgot the words.


After  participants  finished  reading  the  instructions  the  memorization  phase 
proceeded  as  follows: 


Memorization Steps. Step 0) We generated a random action and a random object,
and displayed these words to the user. Step 1) We asked each participant to spend
one minute memorizing his words. We suggested that participants imagine a
person performing the action with the object. Step 2) We asked each participant to
type in a story which includes the action and the object in the correct order. Step 3)
We asked each participant to type in both words two times, paying attention
to the order. See Figure 4.2 for an example of Steps 0-3. Step 4) Most participants
were asked to complete Steps 0 through 3 four times to memorize a total of eight
words (four actions and four objects). Step 5) Finally, we asked each participant
to complete a rehearsal phase.
[Figure: screenshot of the standard group memorization interface. On-screen text: "Assigned words #1 of 4: kissing sauce. Your words are: kissing sauce. Spend a minute to memorize these words by imagining a story involving a person performing the action with the object. When you are ready, type in your story, which must include your words, in the correct order." Fields: Story; "Type your words twice in the boxes below" (Action, Object); Continue button.]

Figure 4.2: User Study: Non-Mnemonic Group Memorization Phase

4.3.3  Rehearsal  Phase 

Each participant was assigned a particular rehearsal schedule. The particular
times that we asked the participant to return were given by the rehearsal schedule
that the participant was assigned to use (see Table 4.1). We e-mailed participants to
remind them to return for each rehearsal:

Dear Carnegie Mellon study participant: Please return to (url) to
participate in the next part of the memory study. If you do not return
promptly upon receiving this email, you might not be considered for
future phases of the study. You will receive a $0.75 bonus payment for
completing this task and it should take less than five minutes.

Remember  that  you  should  not  write  down  the  words  that  were 
assigned  to  you.  You  will  be  paid  for  each  completed  rehearsal  phase 
-  even  if  you  forgot  the  words. 

There  is  no  need  to  return  to  Mechanical  Turk  and  find  the  HIT  to 
receive  the  bonus,  this  bonus  and  any  future  bonuses  will  be  applied  to 
this  MTurk  account  automatically  as  you  complete  each  phase.  Please 
do  not  attempt  to  take  the  HIT  again  on  MTurk  as  this  will  result  in  a 
rejection. 

If, for any reason, you do not want to complete the study, please
reply to this email and let us know why, so we can improve our protocol
for future studies.

Thank you! The Carnegie Mellon University Study Team

We describe the rehearsal phase below:


Mnemonic  Group 

Each participant from the mnemonic group was shown the picture of a scene and
the picture of the person that he chose while memorizing his first story during the
memorization phase (see Figure 4.2b). We then asked each participant to recall
the person-action-object story he made up and enter the associated action and
object. If the participant was correct then we moved on to the next story. If the
participant was incorrect then we asked the participant to try again. After three
incorrect guesses we asked the participant to repeat the memorization phase with
the same actions and objects, and try again. The rehearsal was finished once the
participant had correctly entered all four action-object pairs.


Standard  Group 

Each  participant  from  the  standard  group  was  simply  asked  to  recall  the  random 
actions  and  the  random  objects  that  he  was  given  during  the  memorization  phase. 
If  the  participant  was  incorrect  then  we  asked  the  participant  to  try  again.  After 
three  incorrect  guesses  we  asked  the  participant  to  repeat  the  memorization  phase, 
and try again. The rehearsal was finished when the participant had entered all of the
actions and objects correctly.


4.3.4  Follow  Up  Survey 

Some  participants  did  not  return  to  rehearse  their  stories  during  the  rehearsal 
phase.  We  cannot  tell  whether  or  not  these  participants  would  have  remembered 
their  passwords  if  they  had  returned.  Instead  we  can  only  report  the  fraction  of 
participants  who  remembered  their  passwords  among  those  who  returned  for  each 
rehearsal during the study. There are several reasons why a participant may not
have returned (e.g., too busy, did not get the follow up message in time, convinced
s/he would not remember the password). If participants do not return because
they are convinced that they would not remember the password then this could
be a source of bias (e.g., we would be selecting participants who are confident
that they remember the story). Our hypothesis is that the primary reasons that
participants do not return are that they were too busy, that they did not get
our follow up message in time, or that they were not interested in interacting
with us outside of the initial Mechanical Turk HIT, and not that they were
convinced that they would not remember the story. In order to test our hypothesis
we sent a follow up survey to all participants who did not return to complete a
rehearsal phase. The purpose of this survey was to allow us to check for sources
of bias. Participants were paid 25 cents for completing this survey. The survey
is described below:

You  are  receiving  this  message  because  you  recently  participated  in 
a  CUPS  Memory  Study  at  CMU.  A  while  ago  you  received  an  e-mail 
to participate in a follow up test. We would like to ask you to
complete  a  quick  survey  to  help  us  determine  why  participants  were 
not  able  to  return  to  complete  this  follow  up  study.  The  survey  should 
take  less  than  a  minute  to  complete,  and  you  will  be  paid  25  cents  for 
completing  the  survey.  The  survey  consists  of  one  question.  Which  of 
the  following  reasons  best  describes  why  you  were  unable  to  return  to 
take  the  follow  up  test? 

A  I  no  longer  wished  to  participate  in  the  study. 

B  I  was  too  busy  when  I  got  the  e-mail  for  the  follow  up  test. 

C  I  did  not  see  the  e-mail  for  the  follow  up  test  until  it  was  too  late. 

D  I  was  convinced  that  I  would  not  be  able  to  remember  the  words/stories 
that  I  memorized  when  I  received  the  e-mail  for  the  follow  up  test. 

E I generally do not participate in follow up studies on Mechanical
Turk.


Discussion. It is possible that some participants will choose not to participate
in the follow up survey. However, in our case their decision not to participate is
itself valuable information that supports our hypothesis.


4.3.5  Rehearsal  Schedules 

Each user was assigned one of the rehearsal schedules from Table 4.1. If a participant
was assigned to the Aggressive rehearsal schedule, we sent that participant a reminder
to rehearse 1 day after the memorization phase. If the participant successfully
completed the first rehearsal phase, we sent another reminder to rehearse 2 days after
the memorization phase, the next reminder on day 4, and so on. The final rehearsal
took place on day 64.

We use the following syntactic pattern to denote a group of participants:
(Memorization Technique)_(Rehearsal Schedule)_(Number of action-object pairs
memorized). For example, a participant in the group mnemonic_aggressive_4 is a
user who was asked to memorize four actions and four objects using the mnemonic
techniques we suggested and to rehearse his person-action-object stories following
the Aggressive rehearsal schedule from Table 4.1. Because most participants
were asked to memorize four actions and four objects we will sometimes drop the
number at the end unless the participant was only asked to memorize one or two
action-object pairs.

Remark 5. We use the label "Aggressive" to refer to a rehearsal schedule that we believe
will be more challenging for each participant (e.g., the length of time between consecutive
rehearsals grows at a faster rate). Similarly, we use the label "Conservative" to refer to a
rehearsal schedule that we believe will be less challenging for each participant.


Schedule            Multiplier   Base       Rehearsal Times (days unless noted)
Aggressive          x2           1 day      1, 2, 4, 8, 16, 32, 64
Conservative        x1.5         1 day      1, 2.5, 5, 8, 13, 21, 32, 49, 74
Very Conservative   x1.5         0.5 days   0.5, 1.25, 2.4, 4, 6.5, 10, 16, 24, 37, 56, 85
Heavy Start         x2           1 day      0.1, 0.5, 1, 2, 4, 8, 16, 32, 64
Heavier Start       x2           30 min     1 hr, 2 hr, 4 hr, 8 hr, 1 day, 2, 4, 8, 16, 32, 64

Table 4.1: Rehearsal Schedules
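
As a rough illustration of how the Multiplier and Base columns relate to the listed times, the following Python sketch regenerates two rows of Table 4.1. The Aggressive row simply doubles from one day. Reading the Conservative row as cumulative sums of gaps that grow by a factor of 1.5 is an assumption on our part (the text does not spell out the rule), and the table values appear to be rounded. The function names are hypothetical.

```python
def doubling_times(base_days, count):
    """Rehearsal times that double each round: base, 2*base, 4*base, ..."""
    return [base_days * 2 ** i for i in range(count)]


def cumulative_gap_times(base_days, multiplier, count):
    """Rehearsal times whose gaps grow by `multiplier` (assumed reading).

    The first gap is `base_days`; each later gap is `multiplier` times the
    previous one, and each rehearsal time is the running sum of the gaps.
    """
    times, elapsed, gap = [], 0.0, float(base_days)
    for _ in range(count):
        elapsed += gap
        times.append(elapsed)
        gap *= multiplier
    return times


aggressive = doubling_times(1, 7)
conservative = cumulative_gap_times(1, 1.5, 9)

# Values copied from the Conservative row of Table 4.1 (in days).
table_conservative = [1, 2.5, 5, 8, 13, 21, 32, 49, 74]

print(aggressive)
print([round(t, 2) for t in conservative])
```

Under this reading the generated Conservative times stay within one day of the tabulated values, which is consistent with rounding but does not prove that this is the rule the study used.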


4.3.6  List  of  People,  Actions  and  Objects  from  the  User  Study 

Here is a list of the people, actions and objects we used in the study.

People: Bill Gates, Bill Clinton, George W Bush, Lebron James, Kobe Bryant, Brad
Pitt, Darth Vader, Luke Skywalker, Frodo, Gandalf, Michael Jordan, Tiger Woods,
Michael Phelps, Angelina Jolie, Albert Einstein, Oprah Winfrey, Nelson Mandela,
Bart Simpson, Homer Simpson, Adolf Hitler, Steve Jobs, Mark Zuckerberg, Justin
Timberlake, Jay Z, Beyonce, Kim Jong Un, Joe Biden, Barack Obama, Pope Francis,
Rand Paul, Ron Paul, Ben Affleck, Hillary Clinton, Jimmy Fallon


Actions: gnawing, mowing, rowing, oiling, egging, waving, bowing, seizing,
stewing, signing, searing, bribing, swallowing, sucking, saving, sipping, tazing,
tattooing, drying, dueling, dodging, tugging, taping, nosing, hunting, numbing,
inhaling, knifing, nipping, muddying, miming, marrying, mauling, mashing,
mugging, moving, mopping, racing, riding, reeling, reaching, raking, lassoing,
welding, aligning, leashing, elbowing, juicing, shining, sheering, judging, choking,
chipping, coating, concealing, destroying, kissing, aiming, kicking, punching,
canning, combing, gluing, cooking, giving, copying, vising, voting, fanning,
fuming, firing, fishing, high fiving, batting, burying, plowing, puking, popping,
tasting, pulling, climbing, weeping, swimming, stretching, following, paddling,
howling, smelling, rolling, waking, jumping


Objects:  saw,  teacup,  hen,  ammo,  arrow,  owl,  shoe,  cow,  hoof,  boa,  sauce,  suit, 
snow,  piranha,  chainsaw,  shark,  tiger,  snake,  razor-blade,  sumo,  seal,  sock,  safe, 
soap,  daisy,  toad,  dime,  tire,  dish,  duck,  dove,  ant,  onion,  wiener,  nail,  navy,  menu, 
mummy,  hammer,  mail,  microphone,  horse,  rat,  iron,  ram,  pin,  roach,  rib,  lion, 
lime,  leach,  lock,  leaf,  cheese,  jet,  chain,  chime,  gyro,  chili,  jeep,  goose,  cat,  wagon, 
igloo,  couch,  cake,  coffee,  cab,  vase,  foot,  phone,  waffle,  fish,  bus,  patty,  bunny, 
bomb,  pill,  bush,  bike,  beehive,  puppy,  kite,  canoe,  boar,  apple,  moon,  moose, 
tepee,  ditch,  key,  shoe,  home,  toe,  nose,  cheetah 


4.4  Preliminary  Results 

We say that a participant survived through rehearsal i if that participant correctly
remembered all of his stories in < 3 attempts during rehearsals j = 1, ..., i, and
we use Survived(i) to denote the number of participants who survived through
rehearsal i (Survived(0) denotes the number of initial participants). Table 4.2
shows how many participants survived each round. We stress that there are
two reasons that a participant might not survive through rehearsal i: (1) the
participant failed to return to rehearse in a timely manner when we asked, or (2) the
participant failed to remember all of his stories in < 3 attempts. Because we used
the mnemonic_heavystart and the text_heavystart conditions for our pilot study we
have the results from more rehearsals under those conditions. Observe that for the
first rehearsal we have the largest drop-off under the heavystart and heavierstart
conditions. This may seem paradoxical because we would expect participants in
these conditions to have a better chance of remembering their words because they
had less time to forget them. Indeed, this is true for participants who returned to
complete the first rehearsal. However, many participants in these conditions were
not able to return in a timely manner because less time had elapsed.

                            Initial   Survived(i)
Condition / i               (i = 0)     1    2    3    4    5    6    7    8    9   10   11
mnemonic_heavierstart            73    38   27   26   24   22   22   22   22   22   20   18
mnemonic_heavystart              80    48   41   38   36   36   35   33   27    7
text_heavystart                 100    63   52   51   51   48   46   39   31    7
mnemonic_aggressive_real         75    50   42   40   38   36   30
mnemonic_aggressive_2            81    50   42   42   41   38   37   36   33
mnemonic_aggressive_1            86    64   52   49   49   47   46   45   41
mnemonic_veryconservative        83    62   52   51   51   49

Table 4.2: Survived(i)


Total Survival Rate Among Participants who Always Returned. Figures 4.3a
and 4.3b show the total survival rate for each completed rehearsal under each
study condition for participants who always returned when we asked. More
specifically, Figures 4.3a and 4.3b plot the value of

\[ \frac{\mathrm{Survived}(i)}{\mathrm{Survived}(t) + \sum_{j=1}^{t} \mathrm{Failed}(j)} \]

where t denotes the total number of rehearsals and Failed(j) denotes the number
of participants who survived through rehearsals 1, ..., j - 1 and did not remember
all of their words on rehearsal j in < 3 attempts. We use Time(i) to denote
the time of the i'th rehearsal. Observe that each of the curves in these figures
is monotonically decreasing. This is because a participant who did not survive
round i will also be counted as a participant who did not survive round j for each
j > i.

Figure 4.3: Total Survival with Failures Carried Forward:
Survived(i) / (Survived(t) + sum over j of Failed(j)) vs Time(i).

While we include this data for completeness we emphasize that this view is
overly pessimistic. For example, consider a participant who correctly remembered
his stories during the first three rehearsals, but was not able to return for the fourth
rehearsal (e.g., because he went on vacation). The results of this participant would
be dropped. However, if the same participant had failed during round three then
his results would be included because we would not have asked him to return
for the fourth rehearsal while he was on vacation. To see how pessimistic this view
can be, suppose that participants who return for rehearsal one succeed with
probability 0.99 and that participants who return for rehearsal i > 1 succeed with
probability 1. If participants always returned to rehearse when we asked them to
then the survival rate after rehearsal i would be 99% for all i > 0. However, if each
participant independently fails to return for each rehearsal phase with probability
p > 0, then the total survival rate among participants who always returned tends
to 0 as the number of rehearsals grows.
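
The thought experiment above can be checked numerically. The sketch below computes the plotted ratio at the final rehearsal i = t using expected counts per initial participant, under the stated assumptions: success probability 0.99 at the first rehearsal, 1 afterwards, and an independent probability p of missing each rehearsal. The function name and the choice p = 0.2 are ours, for illustration only.

```python
def total_survival_ratio(t, p, first_success=0.99):
    """Ratio of expected counts Survived(t) / (Survived(t) + sum_j Failed(j)).

    Counts are per initial participant. A participant survives through
    rehearsal t by returning t times (probability (1 - p) ** t) and passing
    the first rehearsal; the only possible failure is at rehearsal 1.
    """
    survived_t = first_success * (1 - p) ** t
    failed_total = (1 - first_success) * (1 - p)  # returned once, then failed
    return survived_t / (survived_t + failed_total)


for t in (1, 5, 10, 25, 50):
    print(t, round(total_survival_ratio(t, p=0.2), 4))
```

With p = 0 the ratio stays at 0.99 for every t, while for any p > 0 the numerator shrinks geometrically and the failure term in the denominator does not, so the ratio tends to 0 even though almost nobody forgot anything.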


Conditional Survival Probability. Figures 4.4a and 4.4b show the conditional
probability of survival (i.e., the percentage of participants who correctly
remembered all of their stories in < 3 attempts on the i'th rehearsal, conditioned on the
event that the participant survived rounds j = 1, ..., i - 1 and returned for rehearsal
i). More formally, these figures plot the value of Survived(i)/Returned(i), where
Returned(i) counts the number of participants who survived rounds j = 1, ..., i - 1
and returned for rehearsal i. Figures 4.4c and 4.4d show the same data with a
different x-axis (mean time since last visit instead of mean time since first
visit). Notice that these curves are not necessarily monotonic. For example, in
the mnemonic_aggressive condition 25% of users failed to remember all four of
their stories during the first rehearsal. However, every participant who survived
to rehearsal three also survived to rehearsal four in the mnemonic_aggressive
condition. This illustrates the advantage of having several immediate rehearsals. In
the heavystart conditions the last rehearsal on day 64 was the most difficult for
participants.

Figure 4.4: Conditional Survival: Survived(i)/Returned(i) vs Time(i).

Figure 4.5: EstimatedSurvival(i) vs Time(i)


Estimated True Total Survival Rate. We can use our conditional success
probabilities to estimate what the true survival rate would have been under ideal
circumstances (e.g., all participants are always available to return to rehearse when
we ask them). Our results are shown in Figures 4.5a and 4.5b. We use the
estimate

\[ \mathrm{EstimatedSurvival}(i) = \prod_{j=1}^{i} \frac{\mathrm{Survived}(j)}{\mathrm{Returned}(j)} \]

where Survived(j)/Returned(j) denotes our empirical estimate of the conditional
probability that a participant will survive round j given that the participant
survived all previous rounds and returned for rehearsal j.
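
The estimate is a running product of per-round conditional survival rates. The following minimal sketch computes it from two lists of counts. The function name and the numbers are hypothetical and are not taken from the study.

```python
def estimated_survival(survived, returned):
    """Running product of Survived(j) / Returned(j) for j = 1, ..., i.

    `survived[j-1]` and `returned[j-1]` hold the counts for rehearsal j.
    Returns a list whose (i-1)-th entry is EstimatedSurvival(i).
    """
    estimates, product = [], 1.0
    for s, r in zip(survived, returned):
        product *= s / r  # empirical conditional survival for this round
        estimates.append(product)
    return estimates


# Hypothetical counts, not taken from the study.
returned = [60, 45, 40]
survived = [45, 42, 40]
print(estimated_survival(survived, returned))
```

Because participants who fail to return drop out of both the numerator and the denominator of each factor, this estimate is not dragged down by non-returns in the way the total survival rate is.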


Survey Results. We surveyed 61 participants who did not return to complete
their first rehearsal to ask them why they were not able to return. The results
from our survey are presented in Figures 4.6a and 4.6b. The results strongly
support our hypothesis that the primary reasons that participants
do not return are that they were too busy, that they did not get our follow up
message in time, or that they were not interested in interacting with us outside
of the initial Mechanical Turk HIT, and not that they were convinced that they
would not remember the story.

(a) Participants who Completed the Survey (b) All Invited Survey Participants

Figure 4.6: Survey: Which of the following reasons best describes why you were
unable to return to take the follow up test?

A: I no longer wished to participate in the study.

B:  I  was  too  busy  when  I  got  the  e-mail  for  the  follow  up  test. 

C:  I  did  not  see  the  e-mail  for  the  follow  up  test  until  it  was  too  late. 

D:  I  was  convinced  that  I  would  not  be  able  to  remember  the  words/stories  that  I 
memorized  when  I  received  the  e-mail  for  the  follow  up  test. 

E:  I  generally  do  not  participate  in  follow  up  studies  on  Mechanical  Turk. 

F: Participant did not respond to survey.

Fun. We had several participants e-mail us to tell us how much fun they were
having memorizing person-action-object stories. The results from our survey are
also consistent with the hypothesis that memorizing person-action-object stories
is fun (e.g., no participants said that they no longer wished to participate in the
study).


4.4.1  Discussion 

Mnemonic Advantage. Our results strongly support the hypothesis that
$\sigma_{\text{mnemonic}} > \sigma_{\text{non-mnemonic}}$. Participants consistently did better in the
mnemonic_heavystart condition than in the text_heavystart condition. For example,
compare the estimated survival rates for mnemonic_heavystart and text_heavystart
in Figure 4.5a or compare the conditional survival rates for mnemonic_heavystart and
text_heavystart in Figure 4.4a. Even the pessimistic total survival rates shown in
Figure 4.3a support this hypothesis.


Benefit of Several Early Rehearsals. Participants did very well in conditions
which involve several early rehearsals like the mnemonic_heavystart and the
mnemonic_heavierstart conditions. In the mnemonic_veryconservative condition
a few participants struggled during the first rehearsal, but have been perfect
since then. In the mnemonic_aggressive condition participants also struggled most
during the first rehearsal.


Interference. While a few participants struggled under the mnemonic_heavystart
condition this was not the case when we only asked participants to memorize one
or two person-action-object stories. This is likely because participants had less
energy to devote to memorizing the third and fourth story in the mnemonic_heavystart
condition. Our results do not mean that users are incapable of remembering
multiple stories. In fact, most participants in the mnemonic_heavystart condition were
able to remember all four of their stories. However, our results do indicate that it may be
prudent to either adopt a rehearsal schedule with many early rehearsals whenever
the user is memorizing multiple stories at once, or space out the memorization
process so that users are not memorizing multiple stories at once.

Chapter 5

Password Composition Policies: A Defense Against Online Attacks

5.1  Introduction 


Imagine a web surfer, an online shopper, or a reviewer in a prominent CS and
Economics conference who logs on for the first time to a server so that she can
sign up for some service, place a shopping order, or view a list of assigned papers.
Such a user registers on the server by choosing a username and picking a password.
Naturally, our user's first attempt at picking a password is her favorite combination
'123456', which the server declines. She then has to pick a password that follows
certain guidelines: of suitable length, involving lower- and upper-case letters, with
numbers or special characters, etc. Such password composition policies defend against
the "first line" of attack: guessing attacks by uninformed attackers (attackers with
no previous knowledge of the user whose account they are trying to break into).

Password composition policies are a necessity because, without them,
user-selected passwords are predictable. Indeed, many unrestricted users would select
simple passwords like '123456', 'password' and 'letmein' [66]. Furthermore, this
issue is of great importance to today's economy. Passwords are commonly used in
electronic commerce to protect financial assets. In fact, the passwords themselves
have financial value. Symantec reported that compromised passwords are sold for
between $4 and $30 on the black market [79], and a 2004 Gartner case study [158]
estimated that it cost a large firm over $17 per password-reset call. Nevertheless,
existing password composition policies are typically not principled, and do not
necessarily result in less common passwords. For example, studies show that
users respond to restrictions in predictable ways [97], or pick weaker passwords
due to user-fatigue [56, 102].

In this chapter, we initiate the algorithmic study of password composition policies.
Such policies restrict the space of passwords to a subset of allowed passwords,
and force each user to pick a password in this subset. Thus, n users induce a
distribution over passwords where for a password w,
$\Pr[w] = \frac{1}{n}\,|\{i : i \text{ picks } w\}|$. By
declaring different subsets of allowed passwords, different password composition
policies induce different distributions. Our work formalizes and addresses
the algorithmic problem a server administrator faces when designing a password
composition policy; we ask:

In what settings can the information about the users' preferences over passwords
allow us to design a password composition policy that is guaranteed
to induce a password distribution as close to uniform as possible?

We wish to stress at this point that in this chapter we do not take a cryptographic
approach to the problem: we do not design a protocol aimed at amplifying
a password's strength, nor do we rely on standard cryptographic assumptions or
techniques in designing our password composition policies. Single-factor
authentication does not defend against an attacker who learns about the most probable
password from an external source. Furthermore, because password systems often
allow users multiple attempts in entering their password, an attacker can make a
small number of guesses with impunity. Therefore, we instead focus on the design
and analysis of algorithms for optimizing the password composition policy's
induced distribution over passwords, and in our theoretical results compare the
performance of our algorithm to the optimal policy among exponentially many
potential policies in the worst case.


5.1.1  Our  Model 

We  study  the  algorithmic  problem  of  optimizing  password  composition  policies 
along  multiple  dimensions:  the  goal,  the  user  model,  and  the  policy  structure. 

Goal. We focus on designing a policy that maximizes the minimum-entropy of the
resulting password distribution. Specifically, we assume the server deals with n
users, each picking a password from some space of passwords P that respects the
server's password composition policy. These n passwords form a distribution over
the domain of all allowed passwords and our goal is to minimize the probability
of the most likely password. This is a natural goal (see Section 5.7), as opposed
to maximizing the Shannon-entropy of the distribution, which for example is
still high even if half the people choose the same password and the other half
choose a password uniformly at random from P. From a security standpoint, the
minimum-entropy reflects the fraction of accounts that could be compromised
in one guess. For example, an adversary would be able to crack 0.9% of RockYou
passwords [92] with only one guess. Alternatively, should the attacker attempt to
break into only one account, the minimum-entropy reflects the likelihood that
the account is compromised on the first guess. We also consider a slightly stronger
goal of minimizing the fraction of accounts that could be compromised using k
guesses, that is, the overall probability of the k most likely passwords [45].
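
To make the contrast between the two entropy notions concrete, the sketch below evaluates both for the skewed distribution described above: half of the users choose one fixed password and the rest choose uniformly from P. We take min-entropy to be -log2 of the largest probability, the standard definition, which we assume is the one intended here. The function name and the size |P| = 2^20 are illustrative choices.

```python
import math


def entropies(num_passwords):
    """Shannon and min-entropy (in bits) of the skewed distribution.

    Half of the users pick one fixed password; the other half pick
    uniformly at random from all `num_passwords` passwords.
    """
    uniform_share = 0.5 / num_passwords
    p_top = 0.5 + uniform_share          # the fixed password
    p_other = uniform_share              # each of the remaining passwords
    shannon = -p_top * math.log2(p_top)
    shannon -= (num_passwords - 1) * p_other * math.log2(p_other)
    min_entropy = -math.log2(p_top)
    return shannon, min_entropy


shannon, min_entropy = entropies(2 ** 20)
print(f"Shannon entropy: {shannon:.2f} bits")
print(f"Min-entropy:     {min_entropy:.2f} bits")
```

The Shannon entropy comes out at roughly eleven bits while the min-entropy is about one bit, reflecting that a single guess still compromises about half of the accounts.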

User  model.  We  consider  two  models  for  how  users  select  passwords  when 
presented  with  a  password  composition  policy. 

In  the  ranking  model,  each  user  has  an  implicit  ranking  over  passwords,  from 
the  most  preferred  to  the  least  preferred.  Given  a  password  policy,  each  user 
selects  the  highest-ranking  password  among  those  allowed  by  the  policy.  There 
is  a  distribution  over  the  space  of  rankings  that  determines  the  fraction  of  users 
with  each  possible  ranking.  Note  that  for  any  password  composition  policy,  such 
a  distribution  over  rankings  induces  a  distribution  over  the  most  preferred  allowed 
passwords. 
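
The selection rule of the ranking model can be sketched as follows. The rankings, the password names and the function name are hypothetical, and the sketch assumes that every ranking contains at least one allowed password.

```python
from collections import Counter


def induced_distribution(rankings, allowed):
    """Each user picks the first password in their ranking that is allowed."""
    picks = Counter()
    for ranking in rankings:
        # Assumes at least one password in the ranking is allowed.
        choice = next(w for w in ranking if w in allowed)
        picks[choice] += 1
    n = len(rankings)
    return {w: count / n for w, count in picks.items()}


# Hypothetical rankings for four users, most preferred password first.
rankings = [
    ["123456", "password", "letmein"],
    ["123456", "letmein", "password"],
    ["password", "123456", "letmein"],
    ["123456", "password", "letmein"],
]

print(induced_distribution(rankings, {"123456", "password", "letmein"}))
print(induced_distribution(rankings, {"password", "letmein"}))  # ban '123456'
```

In this toy instance, banning '123456' merely shifts three quarters of the users onto 'password', so the probability of the most likely password does not improve. Where displaced users land depends entirely on their individual rankings.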

In the normalization model, there is a distribution D over the space of all passwords.
This distribution tells us the likelihood that an unrestricted user would
select a given password. Given a password composition policy, D induces a new
distribution over the allowed passwords (which can be obtained by normalizing
the probabilities under D of the allowed passwords). When we ban a password
the fraction of users that prefer each allowed password grows; the natural
interpretation is that users who preferred an allowed password still use that password,
but users who preferred a banned password are redistributed among the allowed
passwords according to the induced distribution.
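
A minimal sketch of the normalization step, assuming D is given explicitly as a dictionary of probabilities. The distribution and the function name are hypothetical.

```python
def normalize(distribution, allowed):
    """Restrict `distribution` to `allowed` and rescale it to sum to 1."""
    total = sum(p for w, p in distribution.items() if w in allowed)
    return {w: p / total for w, p in distribution.items() if w in allowed}


# Hypothetical unrestricted distribution D over four passwords.
D = {"123456": 0.5, "password": 0.25, "letmein": 0.125, "monkey": 0.125}

induced = normalize(D, allowed={"password", "letmein", "monkey"})
print(induced)
```

Banning '123456' here doubles the probability of every remaining password, so their relative proportions are preserved. In the ranking model, by contrast, the displaced users could all move to the same password.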

As we show, the normalization model is strictly more restrictive than the ranking
model: any distribution in the normalization model can be simulated in the
ranking model, but there exist hardness results for the ranking model that do not
hold for the normalization model.

Policy structure. We consider the best policy that is restricted to manipulation of a
given set of rules, where each rule is simply a predefined subset of potential passwords.
These rules are given to us as part of the problem (see Section 5.7 for a discussion of
this point). If we interpret a rule as a subset of banned passwords (e.g., passwords
shorter than seven characters), its complement (e.g., passwords of at least seven
characters) can be interpreted as a subset of allowed passwords. As such, when we
take the union of rules we get either a set of banned passwords (negative rules) or
allowed passwords (positive rules); this is our password composition policy. While
the distinction between the two cases may at first seem a mere technicality, it is
in fact quite significant due to the following observation. If we ban the union of
rules then in order to ban a password that was picked by too many users, it suffices
to ban any single rule that contains this password. In contrast, if we allow a union of
rules then in order to ban this password we must not allow any rule that contains it. In
other words, when our goal is to discard a password in the negative rules setting,
we have multiple ways to do so. When our goal is to discard a password in the
positive rules setting, we have only one way to do so, namely excluding all rules that
allow this password. As we shall see, this seemingly small difference leads to a
clear separation between the two scenarios in terms of the complexity of designing
optimal policies.

We pay special attention to the case where each password has its own singleton
rule. In this setting, a policy can be interpreted as a "blacklist" of banned passwords
that do not necessarily share common characteristics. Note that when each
password has its own singleton rule, it does not matter whether these rules are
positive or negative.
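
The asymmetry between negative and positive rules can be illustrated with plain sets. Everything in the sketch below (the password space, the rule names and the helper functions) is hypothetical and only meant to mirror the definitions above.

```python
def allowed_under_negative(all_passwords, rules, active):
    """Negative rules: the union of the active rules is banned."""
    banned = set().union(*(rules[name] for name in active))
    return all_passwords - banned


def allowed_under_positive(rules, active):
    """Positive rules: the union of the active rules is allowed."""
    return set().union(*(rules[name] for name in active))


# Hypothetical password space and rules (each rule is a set of passwords).
P = {"123456", "abc", "password", "Tr0ub4dor"}
rules = {
    "all_digits": {"123456"},
    "short": {"123456", "abc"},
    "long": {"password", "Tr0ub4dor"},
}

# Negative setting: activating either rule that contains '123456' bans it.
print(allowed_under_negative(P, rules, ["all_digits"]))
print(allowed_under_negative(P, rules, ["short"]))

# Positive setting: '123456' is excluded only if no active rule contains it.
print(allowed_under_positive(rules, ["long"]))
print(allowed_under_positive(rules, ["long", "all_digits"]))
```

In the negative setting the designer may choose which rule to activate (banning via "short" also removes 'abc' as a side effect), whereas in the positive setting every rule containing '123456' must be left out.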


5.1.2  Our  Results 

As  we  noted  above,  a  password  composition  policy  induces  a  distribution  over 
most  preferred  passwords  (in  both  user  models).  We  study  algorithms  that  sample 
these  distributions  —  algorithms  that  repeatedly  query  random  users  and  ask 
them  to  choose  a  password  constrained  by  some  policy,  and  then  output  the  a 
good  policy  for  the  empirical  sample  of  users.  Our  goal  is  therefore  twofold:  (i)  to 
show  that  having  sufficiently  many  samples  (i.e.,  sufficiently  many  users  queried) 
guarantees  that  w.h.p  the  best  policy  for  the  empirical  sample  is  good  for  all  users; 
and  (ii)  exhibit  algorithms  that  find  an  optimal  (or  close-to-optimal)  policy  for  a 
given  sample.  Clearly,  we  want  our  sample  size  to  be  "small".  In  particular,  since 
the  size  of  the  space  of  all  passwords  P  —  which  we  denote  by  N  —  is  typically 
very large (e.g., P can include all passwords that are no longer than 32 ASCII
characters), we wish to get a bound on the sample size that is independent of N.

Table 5.1: Summary of Complexity Results.

                 | Ranking Model                                    | Normalization Model
                 | Constant k            | Large k                  | Constant k        | Large k
 ----------------+-----------------------+--------------------------+-------------------+------------------
 Singleton rules | P                     | NP-Hard (Thm 13);        | P                 | P (Thm 16)
                 |                       | APX-Hard w. UGC (Thm 14) |                   |
 Positive rules  | P (Thm 12)            | NP-Hard                  | P                 | NP-Hard (Thm 18)
 Negative rules  | $n^{1/3-\epsilon}$-approx | NP-Hard              | NP-Hard (Thm 17)  | NP-Hard
                 | is NP-hard (Thm 15)   |                          |                   |

For  the  ease  of  exposition,  we  discuss  goal  (ii)  before  goal  (i).  I.e.,  we  first 
(Sections  5.3  and  5.4)  study  the  problem  in  a  simpler  setting  where  the  preferences 
of  all  users  are  given  to  us  as  input;  and  only  then  (Section  5.5)  we  introduce  an 
algorithm  that  samples  users'  preferences.  Also  for  the  ease  of  exposition,  we 
first  discuss  algorithms  where  P  is  a  part  of  the  input,  so  they  are  allowed  to 
run  in  time  polynomial  in  N.  This  is  motivated  by  the  fact  that  computational 
complexity of problems in this setting informs their study in the sampling setting:
it is hopeless to design efficient sampling algorithms for problems that
are  computationally  hard.  (Efficient  sampling  algorithms  are  applicable  only  to 
computationally  tractable  problems.) 

Table  5.1  summarizes  our  complexity  results.  The  parameter  k  refers  to  our 
optimization  target:  minimizing  the  likelihood  of  the  k  most  likely  passwords. 
Some  results  are  direct  corollaries  of  others  —  using  the  fact  that  singleton  rules 
are  a  special  case  of  positive  rules  and  the  fact  that  the  normalization  model  is 
a  special  case  of  the  ranking  model  (see  Section  5.2).  Looking  at  the  table  one 
immediately  notices  a  clear  separation  between  negative  rules  and  positive  rules: 
optimization  using  the  latter  is  much  easier. 

We therefore focus on positive rules in our attempt to design an efficient sampling
algorithm. Our main result is the best one could hope for in this setting. We
design an algorithm that works in the more general ranking model, and finds a
policy whose entropy is $\epsilon$-close to optimal with probability $1-\delta$, for any given
$\epsilon, \delta > 0$. The required number of samples is polynomial in $1/\epsilon$, $\log(1/\delta)$, and the
number of positive rules $m$. We can assume that $m$ is small, because each rule
corresponds to a subset of passwords that can be concisely described to users.

These  results  can  be  applied  in  a  practical  setting,  and  we  show  this  through 
simulated  sampling  experiments  using  natural  rules  and  a  large  dataset  of  real 
passwords.  The  experimental  results  provide  evidence  for  the  difficulty  of  the 
negative  rules  setting:  we  search  all  combinations  of  rules  to  find  the  optimal 
policy  and  then  attempt  to  discover  this  policy  by  making  decisions  both  randomly 
and  with  a  heuristic.  In  the  negative  rules  setting,  neither  approach  succeeded 
at  finding  the  optimal  policy  after  hundreds  of  iterations  at  various  sample  sizes, 
and  average-case  performance  did  not  improve  with  sample  size.  In  the  positive 
rules  setting,  the  average-case  performance  of  our  efficient  algorithm  improved 
with  sample  size  and,  with  a  moderate  sample  size,  found  policies  that  were  either 
optimal  or  very  close  to  optimal. 


5.1.3  Related  Work 

It  has  been  repeatedly  demonstrated  that  users  tend  to  select  easily  guessable 
passwords  [39,  66,  92]  and  NIST  recommends  that  organizations  "should  also 
ensure  that  other  trivial  passwords  cannot  be  set,"  to  thwart  potential  attackers 
[133].  Unfortunately,  this  task  is  more  difficult  than  it  might  appear  at  first.  Policies 
were  initially  developed  without  empirical  data  to  support  them,  since  such  data 
was not available to policy designers [50]. When hackers leaked the RockYou
dataset to the Internet, researchers and attackers alike suddenly had access to
password data, leading to many insights into real-world passwords [156]. However,
recent  research  analyzing  leaked  datasets  from  non-English  speakers,  notably 
Hebrew  and  Chinese-language  websites,  shows  that  trivial  password  choices  can 
vary  between  contexts,  making  a  simple  blacklist  approach  ineffective  [40].  This 
means  that,  depending  on  the  context,  a  policy  based  on  leaked  password  data 
might  provide  no  security  guarantee,  and  it  has  ethical  issues  as  well. 

To combat this issue, researchers have turned to a sampling approach. Bonneau
2012 added a system for sampling to the Yahoo! password infrastructure.
This system allows one to gain empirical data about the frequency distribution of
passwords without revealing the passwords themselves. Such approaches provide
a way of gathering empirical data about passwords while maintaining the
anonymity of users. Our algorithms could be used in conjunction with such an
infrastructure to optimize policies.

Komanduri et al. 2012 studied the effectiveness of several basic password composition
policies by using Amazon's Mechanical Turk to conduct a large scale user
study. They found that people often respond to restrictions in predictable ways
(e.g., if the password needs to contain a capital letter users might tend to capitalize
the first letter of a password) and provide very general recommendations for password
composition policies. However, no theoretical model has been proposed for
studying  the  password  composition  problem. 

Schechter  et  al.  2010  suggest  using  a  popularity  oracle  to  prevent  individual 
passwords  that  have  been  used  too  frequently  from  being  selected  by  new  users. 
They  also  proposed  using  the  count-min  sketch  data  structure  [57]  to  build  such 
a  popularity  oracle.  Malone  and  Maher  2012  suggest  a  similar  system  using 
a  Metropolis-Hastings  scheme  to  force  an  approximately  uniform  distribution 
on  passwords.  Usability  results  on  the  effectiveness  of  dictionary  checks  [97] 
suggest  that  such  policies  would  be  very  frustrating  since  the  policy  is  hidden 
from  users  behind  an  oracle.  In  contrast,  we  seek  to  construct  optimal  policies 
from  combinations  of  rules  that  are  visible  to  the  user  and  can  be  described  in 
natural  language. 

This  consideration  of  users  is  important  to  electronic  commerce,  even  where 
security is concerned. Florencio and Herley 2010 studied the economic factors
that drive institutions to adopt strict password composition policies and found that
they often value the user experience over security. An e-mail provider like Yahoo!
might  adopt  simple  composition  policies  because  a  frustrated  user  could  easily 
switch  to  Gmail,  while  universities  are  free  to  adopt  strict  policies  because  users 
cannot  switch  easily. 


5.2  A  Model  of  Password  Composition  Policies 

We use $\mathcal{P}$ to denote the space of all possible passwords, and $N = |\mathcal{P}|$ to denote
the total number of passwords. We denote the number of users by $n$.

A password composition policy may be specified in terms of rules. A rule is
a subset of passwords $R \subseteq \mathcal{P}$ (e.g., the set of all passwords with more than seven
characters). We use $R_1, \ldots, R_m$ to denote a list of rules that may be active or inactive.
We consider two schemes.

•  Positive Rules: A password $w$ is allowed if and only if it is allowed by some
active positive rule. Formally, a password composition policy $\mathcal{A}_S = \bigcup_{i \in S} R_i$
is specified by a set $S \subseteq [m] = \{1, \ldots, m\}$ of active rules. In this setting rules
should consist of sets of passwords which we expect to be strong (e.g., $R_i$
might be the set of all passwords longer than 10 characters, or the set of
all passwords that use both upper and lowercase letters, or the set of all
passwords that do not include a dictionary word).

•  Negative Rules: A password $w$ is allowed if and only if it is not contained
in any active negative rule. Formally, a policy $\mathcal{A}_S = \{w \in \mathcal{P} \mid w \notin \bigcup_{i \in S} R_i\}$
is given by a subset $S \subseteq [m]$ of active rules. A negative rule should consist
of passwords that we expect to be weak (e.g., $R_i$ might be the set of all
passwords without an uppercase letter, or the set of all passwords shorter
than 6 characters, or the set of all passwords that include a dictionary word).


We also consider the special case of singleton rules, where our rules are
$\{w_1\}, \ldots, \{w_N\}$. Equivalently, we are allowed to ban or allow any individual password.

We use $\Pr[w \mid \mathcal{A}]$ to denote the probability of a password $w$ given composition
policy $\mathcal{A}$. For $w \notin \mathcal{A}$ we have $\Pr[w \mid \mathcal{A}] = 0$. Given a set $W \subseteq \mathcal{P}$ we will also use
$\Pr[W \mid \mathcal{A}] = \sum_{w \in W} \Pr[w \mid \mathcal{A}]$. We use $p(k, \mathcal{A}) = \max_{W \subseteq \mathcal{P} : |W| = k} \Pr[W \mid \mathcal{A}]$ to denote the
probability of the $k$ most popular passwords. Intuitively, $p(k, \mathcal{A})$ represents the
probability that an adversary can successfully guess a password using $k$ attempts.
To avoid cumbersome notation we sometimes use $p_1 = p(1, \mathcal{A})$ to denote the
probability of the most popular password. Similarly, we use $p_2$ (resp., $p_k$) to denote
the probability of the second (resp., $k$'th) most popular password.
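
As a small illustration of this definition, the sketch below computes the empirical value of p(k, A) from a sample of passwords chosen under one fixed policy. The function name `guessing_probability`, the input format (one chosen password per user), and the toy data are assumptions made for this example only.

```python
from collections import Counter
from fractions import Fraction


def guessing_probability(chosen_passwords, k):
    """Return p(k, A): the total empirical mass of the k most popular passwords.

    `chosen_passwords` is a list with one entry per user, i.e. the passwords
    that users picked under some fixed policy A (hypothetical input format).
    """
    counts = Counter(chosen_passwords)
    top_k = sum(count for _, count in counts.most_common(k))
    return Fraction(top_k, len(chosen_passwords))


# Toy sample of six users (made-up data, for illustration only).
picks = ["123456", "123456", "password", "letmein", "123456", "password"]
p1 = guessing_probability(picks, 1)  # success probability with one guess
p2 = guessing_probability(picks, 2)  # success probability with two guesses
print(p1, p2)
```

Exact fractions are used so that comparisons between policies are not affected by floating-point rounding.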

We  consider  two  user  models  that  determine  how  users  choose  passwords 
under  a  given  password  composition  policy. 


•  The ranking model: A ranking is simply a permutation of $\mathcal{P}$, which represents
a user's password preferences. It can be represented using an ordered list
$\ell_i = w_{1,i}, \ldots, w_{N,i}$; user $i$ prefers password $w_{j,i}$ to $w_{j+1,i}$ for all $j$. The ranking
naturally tells us which password $i$ will pick under any composition policy
$\mathcal{A}$. Specifically, $i$ will use password $\ell_i(\mathcal{A}) = w_{j,i}$ where $j = \arg\min\{t : w_{t,i} \in \mathcal{A}\}$.
Given a distribution $\mathcal{D}$ over rankings, we have

    $\Pr[w \mid \mathcal{A}] = \Pr_{\ell \sim \mathcal{D}}[\ell(\mathcal{A}) = w]$.

•  The normalization model: Let $\mathcal{D}$ be an initial distribution over $\mathcal{P}$, and let
$\Pr[w] = \Pr_{x \sim \mathcal{D}}[w = x]$. If we select the composition policy $\mathcal{A}$ then the probabilities
of all $w \in \mathcal{A}$ are simply re-normalized so that

    $\forall \mathcal{A} \subseteq \mathcal{P},\ \forall w \in \mathcal{A}: \quad \Pr[w \mid \mathcal{A}] = \frac{\Pr[w]}{\Pr[\mathcal{A}]}$.

Clearly it holds for both models that the probability of an allowed password
monotonically increases as one bans more passwords. Formally, for all $w \in \mathcal{A}$ and
$B \subseteq \mathcal{P}$ such that $w \notin B$ we have

    $\Pr[w \mid \mathcal{A}] \le \Pr[w \mid \mathcal{A} \setminus B]$.    (5.1)
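
The two user models and the monotonicity property of Equation (5.1) can be checked on a toy instance. The following is a minimal sketch: the helper names `ranking_model` and `normalization_model`, the three-password space, and the four rankings are all hypothetical choices made for illustration.

```python
from collections import Counter
from fractions import Fraction


def ranking_model(rankings, allowed):
    """Pr[w | A] when every user picks the first allowed entry of their ranking.

    Assumes each ranking is a full permutation of the password space and that
    `allowed` is non-empty, so every user has some allowed password.
    """
    picks = Counter(next(w for w in r if w in allowed) for r in rankings)
    return {w: Fraction(c, len(rankings)) for w, c in picks.items()}


def normalization_model(base, allowed):
    """Pr[w | A] obtained by re-normalizing a base distribution over A."""
    mass = sum(p for w, p in base.items() if w in allowed)
    return {w: p / mass for w, p in base.items() if w in allowed}


rankings = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c"), ("c", "b", "a")]
base = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}

before = ranking_model(rankings, {"a", "b", "c"})
after = ranking_model(rankings, {"b", "c"})        # ban "a"
renorm = normalization_model(base, {"b", "c"})     # ban "a"

# Equation (5.1): every password that stays allowed is at least as likely.
monotone = all(after[w] >= before.get(w, 0) for w in after)
print(before, after, renorm, monotone)
```

In this particular toy instance the two models happen to agree after banning "a"; in general the ranking model can redistribute the banned mass unevenly, which is what makes it the more general of the two.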

Another  important  observation  is  that  for  our  purposes  the  ranking  model  is 
more  general  than  the  normalization  model.  Indeed,  we  argue  that  a  distribution 
D  over  passwords  in  the  normalization  model  induces  an  equivalent  distribution 
over rankings. To generate the most highly ranked password, draw a password
$w_1$ from $\mathcal{D}$. Next, let $\mathcal{A}_1 = \mathcal{P} \setminus \{w_1\}$, and draw the next most preferred password
$w_2$, where $w_2 = w$ with probability $\Pr[w \mid \mathcal{A}_1]$. In the following round we ban $w_2$ to
obtain a policy $\mathcal{A}_2$, and so on, until all passwords have been banned.
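
The sampling procedure just described can be sketched in a few lines. This is only an illustration under the assumption that every password has positive base probability; the function name `sample_ranking` and the toy distribution are not from the text.

```python
import random


def sample_ranking(base, rng):
    """Draw one ranking: repeatedly sample from the re-normalized distribution
    over the passwords that are still unbanned, then ban the sampled one.

    `base` maps each password to a positive probability (assumption).
    """
    remaining = dict(base)
    ranking = []
    while remaining:
        words = list(remaining)
        weights = [remaining[w] for w in words]
        # random.choices normalizes the weights, which plays the role of the
        # re-normalization step of the normalization model.
        w = rng.choices(words, weights=weights, k=1)[0]
        ranking.append(w)
        del remaining[w]
    return ranking


base = {"123456": 0.5, "password": 0.3, "letmein": 0.2}  # made-up distribution
ranking = sample_ranking(base, random.Random(0))
print(ranking)  # some permutation of the three passwords
```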

Given $k \in \mathbb{N}$, our goal is to find $S \subseteq [m]$ such that $p(k, \mathcal{A}_S) \le p(k, \mathcal{A}_{S'})$ for
all $S' \subseteq [m]$. When $k = 1$ this goal is equivalent to maximizing the min-entropy.
If $p(k, \mathcal{A}_S) \le c \cdot p(k, \mathcal{A}_{S'}) + \epsilon$ for all $S' \subseteq [m]$ then we say that $S$ is a $(c, \epsilon)$-approximation.
To simplify notation we sometimes use $c$-approximation instead
of $(c, 0)$-approximation.
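
For reference, the equivalence for k = 1 follows from the standard definition of min-entropy. Here $\mathcal{A}_S$ is written for the policy induced by the active rule set S, and the symbol $H_\infty$ is introduced only for this illustration.

```latex
H_\infty(\mathcal{A}_S)
  \;=\; -\log_2 \max_{w \in \mathcal{A}_S} \Pr[w \mid \mathcal{A}_S]
  \;=\; -\log_2 p(1, \mathcal{A}_S),
\qquad\text{so}\qquad
\arg\min_{S \subseteq [m]} p(1, \mathcal{A}_S)
  \;=\; \arg\max_{S \subseteq [m]} H_\infty(\mathcal{A}_S).
```

Because $-\log_2$ is strictly decreasing, any policy that minimizes the probability of the single most popular password also maximizes the min-entropy, and vice versa.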


5.3  Ranking  Model:  Complexity  Results 

In this section we consider the complexity of finding the optimal password composition
policy in the more general ranking model when the organization is given
complete information about users' preferences. Specifically, the organization is
given the rankings $\ell_1, \ldots, \ell_n$ of every user.

Our first result is for the positive rules setting. Given positive rules $R_1, \ldots, R_m$
we show that a set $S$ minimizing $p(k, \mathcal{A}_S)$ can be found efficiently for constant values
of $k$ (see Theorem 12). In fact, for the special case $k = 1$ we present a very simple
algorithm that suffices. Both algorithms can be easily extended to the less general
normalization model. Our algorithms are based on three simple ideas: (1) Reduced Preference
Lists: each preference list $\ell_i$ can be efficiently reduced to a short (length $\le m$)
preference list $\hat{\ell}_i$. (2) Guess and Check: start by guessing the 'structure' of the
optimal solution and find the resulting solution. (3) Iterative Elimination: find
the most popular password $w$ and eliminate all positive rules that contain $w$. Our
sampling algorithms are based on the same core ideas.

Unfortunately, the picture is different in the negative rules setting, even when $k$ is a
constant. Given negative rules we show that it is hard to even $n^{1/3-\epsilon}$-approximate
$p(1, \mathcal{A}_S)$. Also, for non-constant values of $k$ we show that it is hard
to compute the optimal $p(k, \mathcal{A}_S)$ in the singleton rules setting, which immediately implies
hardness in both the positive rules setting and in the negative rules setting. Given
a stronger complexity assumption known as the Unique Games Conjecture [98]
it is also hard to $c_0$-approximate $p(k, \mathcal{A}_S)$ in the singleton rules setting for some
constant $c_0$. However, our hardness results do not rule out the possibility of a
$c$-approximation for a larger constant $c$.


5.3.1  Positive  Rules:  Efficient  Algorithm  for  Constant  k 

We first show that $p(k, \mathcal{A}_S)$ can be optimized efficiently for constant values of $k$
in the positive rules setting. In this section the organization is given positive
rules $R_1, \ldots, R_m$ as well as preference lists $\ell_1, \ldots, \ell_n$. We assume that the organization
can efficiently query the preference lists (e.g., given $S \subseteq [m]$ the organization can
efficiently find $\ell_i(\mathcal{A}_S)$, user $i$'s preferred password given policy $\mathcal{A}_S$).

We elaborate on the key algorithmic ideas listed above. First, we can efficiently
reduce each preference list $\ell_i$ to a list of at most $m$ passwords (Claim 4). While
the reduced list $\hat{\ell}_i$ is much shorter than $\ell_i$ it is still sufficient to determine user
$i$'s preferred password given policy $\mathcal{A}_S$ for any $S \subseteq [m]$. We use $\hat{\mathcal{P}}$ to denote the
reduced space of potential passwords.

Algorithm 5.1 Reduce
Input:
    Preference List: $\ell$
    Positive Rules: $R_1, \ldots, R_m$
Initialize: $i \leftarrow 0$, $S_0 \leftarrow [m]$, $\hat{\ell} \leftarrow$ empty ranking
while $S_i \neq \emptyset$ do
    Let $w$ be $\ell(\mathcal{A}_{S_i})$
    $\hat{\ell} \leftarrow (\hat{\ell}, w)$        ▷ 'Append' the current most preferred password to $\hat{\ell}$
    $S_{i+1} \leftarrow S_i \setminus \{j \mid w \in R_j\}$        ▷ Deactivate all rules that contain $w$
    $i \leftarrow i + 1$
return $\hat{\ell}$

Claim 4. Algorithm 5.1 makes at most $m$ queries to $\ell$ and $m^2$ membership queries and
outputs a reduced preference list $\hat{\ell}$ over at most $m$ passwords such that for every $S \subseteq [m]$
it holds that $\hat{\ell}(\mathcal{A}_S) = \ell(\mathcal{A}_S)$.

Proof. Clearly, the algorithm's main loop iterates at most $m$ times because for each
$i$ we eliminate at least one rule (i.e., $|S_{i+1}| < |S_i|$), so the bound on queries and the
length of $\hat{\ell}$ are immediate. (Because we assume that we can query $\ell$ efficiently,
Algorithm 5.1 is also efficient.) By construction we have $\hat{\ell}(\mathcal{A}_{S_i}) = \ell(\mathcal{A}_{S_i})$ for each $S_i$.
Fix any $S \subseteq [m]$. Let $S_i$ be such that $S \subseteq S_i$ yet $S \not\subseteq S_{i+1}$ and let $w_i$ be the most
preferred word in $\ell$ out of all words in $\bigcup_{j \in S_i} R_j$. If it is the case that $w_i \in \bigcup_{j \in S} R_j$ then
$w_i$ is the most preferred word in $\mathcal{A}_S$ too and we are done. Otherwise, $w_i$ belongs only to rules
in $S_i \setminus S$, which means that removing the set $\{j \in S_i : w_i \in R_j\}$ creates a set $S_{i+1}$ s.t. $S \subseteq S_{i+1}$,
a contradiction. □
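
Algorithm 5.1 can be sketched as follows. This is an illustrative rendering, not the thesis's implementation: it scans the ranking directly instead of using a query oracle, rules are represented as Python sets indexed from 0, and the names `preferred` and `reduce_ranking` as well as the toy data are assumptions.

```python
from itertools import combinations


def preferred(ranking, allowed):
    """First password of `ranking` that lies in `allowed` (None if there is none)."""
    return next((w for w in ranking if w in allowed), None)


def reduce_ranking(ranking, rules):
    """Sketch of Algorithm 5.1: shorten `ranking` to at most len(rules) entries."""
    active = set(range(len(rules)))
    reduced = []
    while active:
        allowed = set().union(*(rules[j] for j in active))
        w = preferred(ranking, allowed)
        if w is None:          # only empty rules are left
            break
        reduced.append(w)      # append the current most preferred password
        # Deactivate all rules that contain w.
        active = {j for j in active if w not in rules[j]}
    return reduced


ranking = ["a", "b", "c", "d", "e"]
rules = [{"a", "c"}, {"b", "c"}, {"e"}]     # hypothetical positive rules
reduced = reduce_ranking(ranking, rules)

# Check the guarantee of Claim 4 on every non-empty set S of active rules.
consistent = all(
    preferred(reduced, set().union(*(rules[j] for j in S)))
    == preferred(ranking, set().union(*(rules[j] for j in S)))
    for size in range(1, len(rules) + 1)
    for S in combinations(range(len(rules)), size)
)
print(reduced, consistent)
```

On this toy input the password "c" is dropped from the reduced list: whenever a rule containing "c" is active, either "a" or "b" is also allowed and preferred.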

5.3.2 Special Case k = 1

For the special case $k = 1$ the simple algorithm IterativeElimination (Algorithm 5.2)
suffices. The basic idea is very simple: iteratively eliminate the most popular
password $w$ by deactivating all positive rules that contain $w$. We repeat this
process until no passwords remain. We claim that one of the solutions along the
way is the optimal solution.

Algorithm 5.2 IterativeElimination
Input:
    Preference Lists: $\ell_1, \ldots, \ell_n$
    Positive Rules: $R_1, \ldots, R_m \subseteq \mathcal{P}$
Initialize: $S_0 \leftarrow [m]$, $i \leftarrow 0$
while $S_i \neq \emptyset$ do
    $w(S_i) \leftarrow \arg\max \{\Pr[w \mid \mathcal{A}_{S_i}] : w \in \mathcal{A}_{S_i}\}$        ▷ $w(S_i)$ is the most popular allowed password
    $S_{i+1} \leftarrow S_i \setminus \{j \mid w(S_i) \in R_j\}$        ▷ Deactivate all rules that contain $w(S_i)$
    $i \leftarrow i + 1$
return $S_{i^*}$ where $i^* \leftarrow \arg\min_i p(1, \mathcal{A}_{S_i})$

Theorem 11. Algorithm 5.2 outputs a set of positive rules $S \subseteq [m]$ such that

    $\forall S' \subseteq [m], \quad p(1, \mathcal{A}_S) \le p(1, \mathcal{A}_{S'})$.

Proof. Let $T$ denote the optimal policy. Clearly if $T = [m]$ then our algorithm
returns $S^* = T$ because that is the first set we try. Otherwise, $T \subsetneq [m]$. Let $S$ be the
last set our algorithm considers that has the property that $T \subseteq S$. Again, if $T = S$,
our algorithm returns $S$. Let $w(T)$ be the most popular word in $\mathcal{A}_T$, and because
of optimality $\Pr[w(T) \mid \mathcal{A}_T] \le \Pr[w(S) \mid \mathcal{A}_S]$.

Now, because we modify $S$ to not contain $T$ in the next iteration, the most
popular word in $S$, $w(S)$, has to belong to some rule $R_j$ where $j \in T$. Therefore
$w(S) \in \bigcup_{j \in T} R_j$, and by definition, the most popular word in $\mathcal{A}_T$ satisfies
$\Pr[w(T) \mid \mathcal{A}_T] \ge \Pr[w(S) \mid \mathcal{A}_T]$.

But observe, because $w(S) \in \bigcup_{j \in T} R_j$ we must have that $w(S)$ is at least as
popular in $T$. Indeed, if $\ell$ is a preference list where we disallowed $\mathcal{P} \setminus \bigcup_{j \in S} R_j$
and the most preferred word is $w(S)$, then as long as we disallow more words
but keep allowing $w(S)$ the word $w(S)$ remains at the top of the list. Therefore,
$\Pr[w(S) \mid \mathcal{A}_T] \ge \Pr[w(S) \mid \mathcal{A}_S]$. Combining together all inequalities we get
$\Pr[w(T) \mid \mathcal{A}_T] = \Pr[w(S) \mid \mathcal{A}_S]$, which means our algorithm returns $S^* = S$. □
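
A compact sketch of Algorithm 5.2 follows. It assumes that the full preference lists are given explicitly (rather than through queries), that every ranking is a permutation of the whole password space, and that every rule is a non-empty set; the function name `iterative_elimination` and the toy instance are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def iterative_elimination(rankings, rules):
    """Sketch of Algorithm 5.2 (k = 1) for positive rules.

    Returns (active rule indices, p(1, A_S)) for the best policy seen while
    repeatedly banning the currently most popular password.
    """
    active = set(range(len(rules)))
    best = None
    while active:
        allowed = set().union(*(rules[j] for j in active))
        picks = Counter(next(w for w in r if w in allowed) for r in rankings)
        w, count = picks.most_common(1)[0]      # most popular allowed password
        p1 = Fraction(count, len(rankings))
        if best is None or p1 < best[1]:
            best = (frozenset(active), p1)
        # Deactivate all rules that contain w.
        active = {j for j in active if w not in rules[j]}
    return best


rules = [{"a", "b"}, {"b", "c"}, {"c", "d"}]    # hypothetical positive rules
rankings = [
    ("a", "b", "c", "d"),
    ("a", "c", "b", "d"),
    ("a", "d", "c", "b"),
    ("b", "c", "d", "a"),
]
best_rules, best_p1 = iterative_elimination(rankings, rules)
print(sorted(best_rules), best_p1)
```

In this toy instance the algorithm visits three policies (all rules, then rules 1 and 2, then rule 2 alone) and keeps the second one, where the most popular password is chosen by half of the users.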


5.3.3  The  General  Case 

We now present an algorithm "Guess and Check" to find the optimal password
composition policy for any constant value of $k$. Our algorithm starts by guessing
what the optimal solution looks like (e.g., what the $k$ most popular passwords
will be in the optimal solution and what the probability of the $k$'th most popular
password is). There are at most $(mn)^{O(k)}$ potential solutions to try by brute force. As
we show, for each candidate solution, it is easy to figure out which sets must be
eliminated using the iterative elimination idea behind Algorithm 5.2.

Theorem 12. Algorithm 5.3 runs in time polynomial in $n^k$, $m^k$ and outputs a set
$S \subseteq [m]$ of positive rules such that

    $p(k, \mathcal{A}_S) \le p(k, \mathcal{A}_{S'})$

for every other set $S' \subseteq [m]$.

Algorithm 5.3 GuessAndCheck
Input:
    Preference Lists: $\ell_1, \ldots, \ell_n$
    Positive Rules: $R_1, \ldots, R_m \subseteq \mathcal{P}$
    Integer $k$
Initialize: Candidates $\leftarrow \emptyset$        ▷ Candidate Solutions
for $i = 1 \to n$ do
    $\hat{\ell}_i \leftarrow$ Reduce($\ell_i$, $R_1, \ldots, R_m$)
$\hat{\mathcal{P}} \leftarrow \bigcup_{i=1}^{n} \hat{\ell}_i$        ▷ Reduced Password Space
for all $(G, p)$ with $G \subseteq \hat{\mathcal{P}}$ s.t. $|G| = k$ and $p \in \{1/n, 2/n, \ldots, 1\}$ do
    $S_{G,p} \leftarrow [m]$
    while $S_{G,p} \neq \emptyset$ and $\exists w \in (\hat{\mathcal{P}} \setminus G) \cap \mathcal{A}_{S_{G,p}}$ s.t. $\Pr[w \mid \mathcal{A}_{S_{G,p}}] > p$ do
        $S_{G,p} \leftarrow S_{G,p} \setminus \{j \mid w \in R_j\}$        ▷ Ban $w$ because it is inconsistent with guess
    if $\Pr[w \mid \mathcal{A}_{S_{G,p}}] \le p$ for all $w \in (\mathcal{A}_{S_{G,p}} \setminus G)$ then
        Candidates $\leftarrow$ Candidates $\cup \{S_{G,p}\}$
return $\arg\min_{S_{G,p} \in \text{Candidates}} p(k, \mathcal{A}_{S_{G,p}})$
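
The guess-and-check idea can be sketched as below. This is a simplified illustration and not the thesis's implementation: it skips the Reduce step and enumerates guesses over the union of all rules, it assumes every ranking is a permutation of that union and every rule is non-empty, and it discards guesses that end with no active rule. All names and the toy instance are assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def distribution(rankings, rules, active):
    """Pr[w | A_S] in the ranking model for the active positive rules S."""
    allowed = set().union(*(rules[j] for j in active))
    picks = Counter(next(w for w in r if w in allowed) for r in rankings)
    return {w: Fraction(c, len(rankings)) for w, c in picks.items()}


def top_k_mass(dist, k):
    """p(k, A): total probability of the k most popular passwords."""
    return sum(sorted(dist.values(), reverse=True)[:k])


def guess_and_check(rankings, rules, k):
    n, m = len(rankings), len(rules)
    space = sorted(set().union(*rules))
    best = None
    for G in combinations(space, k):            # guessed top-k passwords
        for t in range(1, n + 1):               # guessed k'th probability t/n
            p = Fraction(t, n)
            active = set(range(m))
            while active:
                dist = distribution(rankings, rules, active)
                bad = [w for w, q in dist.items() if w not in G and q > p]
                if not bad:
                    break                       # consistent with the guess
                # Ban one inconsistent password by deactivating its rules.
                active = {j for j in active if bad[0] not in rules[j]}
            if active:                          # skip guesses that ban everything
                value = top_k_mass(dist, k)
                if best is None or value < best[1]:
                    best = (frozenset(active), value)
    return best


rules = [{"a", "b"}, {"b", "c"}, {"c", "d"}]    # hypothetical positive rules
rankings = [
    ("a", "b", "c", "d"),
    ("a", "c", "b", "d"),
    ("a", "d", "c", "b"),
    ("b", "c", "d", "a"),
]
result_k2 = guess_and_check(rankings, rules, 2)
result_k1 = guess_and_check(rankings, rules, 1)
print(result_k2, result_k1)
```

Every candidate the loop records is a valid policy, so the returned value can never be below the true optimum; the argument in the proof of Theorem 12 is what guarantees that the correct guess reaches it.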

Proof. It is evident that the running time of the algorithm is poly($n^k$, $m^k$) since we
only have $O((nm)^k)$ potential solutions to try.

Let $\mathcal{A}_{S^*}$ denote an optimal solution and let $G^*$ denote the $k$ most popular
passwords in this solution. Suppose we start with the correct guess ($G = G^*$
and $p$ is the probability of the $k$'th most popular password), then we claim that
our algorithm must produce the optimal solution. In particular, we maintain the
invariant that $\mathcal{A}_{S^*} \subseteq \mathcal{A}_{S_{G,p}}$ until we converge to the optimal solution. Clearly, this
is true initially, before we have eliminated any passwords.

Suppose that the invariant holds and that our algorithm bans a password
$w \in \hat{\mathcal{P}} \setminus G$ by deactivating all rules in $S_{G,p}$ that contain $w$. Then by the definition
of our algorithm we must have $\Pr[w \mid \mathcal{A}_{S_{G,p}}] > p$. If $w \in \mathcal{A}_{S^*}$ then by Equation (5.1)
we have

    $\Pr[w \mid \mathcal{A}_{S^*}] \ge \Pr[w \mid \mathcal{A}_{S_{G,p}}] > p$,

which contradicts the choice of $G$. Therefore $w \notin \mathcal{A}_{S^*}$, so all rules that contain it
are deactivated in $S^*$ and the invariant still holds. By definition Algorithm 5.3
terminates when every password $w \in \mathcal{A}_{S_{G,p}} \setminus G$ has probability at most $p$. Because
our invariant still holds we can apply Equation (5.1) again to get

    $\Pr[G \mid \mathcal{A}_{S_{G,p}}] \le \Pr[G \mid \mathcal{A}_{S^*}] = p(k, \mathcal{A}_{S^*})$.

Hence, $\mathcal{A}_{S_{G,p}}$ is an optimal solution. □


5.3.4  Singleton  Rules:  Hardness  for  Large  k 

Now we turn our attention to the problem of optimizing $p(k, \mathcal{A}_S)$ for large values of
$k$. Theorem 13 says that unless P = NP no polynomial time algorithm can compute
the optimal $p(k, \mathcal{A}_S)$ even with singleton rules. If we are willing to make the Unique Games
Conjecture (UGC) [98] then it is hard to even $c_0$-approximate $p(k, \mathcal{A}_S)$ for some
constant $c_0$. These results immediately imply hardness in both the positive and
negative rules setting because these settings are a generalization of the singleton
rules setting.

Theorem 13. Unless P = NP there is no poly($k, n, N$)-algorithm that gets as input an
arbitrary set of $n$ preference-lists $\ell_1, \ldots, \ell_n$ over $\mathcal{P}$ and an integer $k$, and outputs the optimal
$p(k, \mathcal{A})$ in the singleton rules setting.

Proof. We prove the theorem using a reduction from the Vertex-Cover problem.
Given a graph $G$ over $g$ vertices and $e$ edges and an integer $t$, we first define

    $\mathcal{P} = \{w_u : u \in V(G)\} \cup \{w_{u,v} : (u,v) \in E(G)\}$

and observe that $|\mathcal{P}| = g + e$. We also construct the following $n = 2e$ preference-lists,
where for every edge $(u,v) \in E(G)$ we have the two lists:

    $\ell_{u,v}$:  $w_u, w_{u,v}, \ldots$
    $\ell_{v,u}$:  $w_v, w_{u,v}, \ldots$

where the choice of passwords below position 2 is arbitrary, but both rankings
must be identical from position 2 onwards. Finally, we set $k = g + e - t - 1$.

Given a policy $\mathcal{A} \subseteq \mathcal{P}$, we denote all banned words as $\mathcal{B} = \mathcal{P} \setminus \mathcal{A}$. We denote
by $L_{\mathcal{B}}$ the set of words that at least one user ranks first after banning all words
in $\mathcal{B}$. Observe, $L_{\emptyset} = \{w_u : u \in V(G)\}$. Using this notation, we show this reduction
indeed proves NP-hardness.

First, suppose $G$ has a vertex cover $C$ of size $\le t$. Then by banning all passwords
$\mathcal{B} = \{w_v : v \in C\}$ we now have $L_{\mathcal{B}} = \mathcal{P} \setminus \mathcal{B}$, because for every $(u,v) \in E(G)$ either $w_u$
or $w_v$ is banned, so the word $w_{u,v}$ appears at the top of at least one of the two lists
$\ell_{u,v}, \ell_{v,u}$. Therefore, the $n$ preference-lists induce a distribution whose support
contains $g + e - |\mathcal{B}| \ge g + e - t$ words, thus $p(g + e - t - 1, \mathcal{A}) < 1$.

Conversely, suppose all vertex covers of $G$ are of size at least $t + 1$. Let $\mathcal{B}$
be any set of banned words. Clearly, if $|\mathcal{B}| \ge t + 1$ then the distribution induced
by the $n$ preference-lists has support of size at most $g + e - t - 1$, which means
that $p(g + e - t - 1, \mathcal{A}) = 1$. Otherwise, $|\mathcal{B}| \le t$, and we denote the set of vertices
$C = \{v : w_v \in \mathcal{B}\}$. Observe, since any vertex cover of $G$ must contain $\ge t + 1$ vertices,
there have to be at least $t + 1 - |C|$ edges that $C$ does not cover (since we can
always complete $C$ to a vertex cover by adding one vertex from each uncovered
edge). Therefore, there have to be at least $t + 1 - |C|$ words that do not appear at
the top of any preference list. We conclude that the distribution induced by the $n$
preference-lists has a support of size at most

    $|L_{\mathcal{B}}| \le g - |C| + e - (t + 1 - |C|) = g + e - t - 1$,

thus $p(g + e - t - 1, \mathcal{A}) = 1$. □
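
The construction in this reduction can be made concrete on a tiny graph. The sketch below is illustrative only: passwords are encoded as tuples, the order of the tail of each list is one arbitrary fixed choice, and the names `build_instance` and `top_k_mass` are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def build_instance(vertices, edges):
    """Build the password space and the 2e preference lists of the reduction.

    The tail of each list (everything after position 2) is an arbitrary
    fixed order here.
    """
    space = [("v", u) for u in vertices] + [("e", u, v) for u, v in edges]
    rankings = []
    for u, v in edges:
        for first in (("v", u), ("v", v)):
            head = [first, ("e", u, v)]
            rankings.append(head + [w for w in space if w not in head])
    return space, rankings


def top_k_mass(rankings, banned, k):
    """p(k, A) when exactly the passwords in `banned` are disallowed."""
    picks = Counter(next(w for w in r if w not in banned) for r in rankings)
    top = sum(c for _, c in picks.most_common(k))
    return Fraction(top, len(rankings))


# Path graph 1 - 2 - 3: g = 3 vertices, e = 2 edges, vertex cover {2}, t = 1.
vertices, edges, t = [1, 2, 3], [(1, 2), (2, 3)], 1
space, rankings = build_instance(vertices, edges)
k = len(vertices) + len(edges) - t - 1            # k = g + e - t - 1 = 3

with_cover = top_k_mass(rankings, {("v", 2)}, k)  # ban the cover's passwords
no_ban = top_k_mass(rankings, set(), k)
print(with_cover, no_ban)
```

Banning the password of the cover vertex spreads the four users over four distinct passwords, so three guesses no longer cover everyone; with no bans the support has only three passwords and three guesses always succeed.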

From the same reduction described in Theorem 13 we get UGC-hardness of
approximation. While there are sub-exponential time algorithms to solve the
Unique Games problem [20], there are no known polynomial time algorithms.
Many famous approximation hardness results are based on the Unique Games
Conjecture (e.g., $2 - \epsilon$ hardness for vertex cover [99]). Our reduction relies on a
result  in  [22],  which  says  that  vertex  cover  is  hard  to  approximate  up  to  a  (say)  1.5- 
factor  even  on  bounded  degree  graphs.  Because  we  start  with  a  bounded  degree 
graph  we  can  argue  that  each  password  in  our  reduction  appears  at  the  top  of  at 
most  d  preference-lists  for  some  constant  d. 

Theorem 14. There exists a constant $c > 1$ such that it is UGC-hard for a poly($n, N, k$)-time
algorithm to $c$-approximate the optimal $p(k, \mathcal{A})$ in the singleton rules setting and the
rankings model.

Proof of Theorem 14. We begin with a construction of a bounded degree graph
which is hard to approximate up to a (say) 1.5-factor. As shown in [22], for every
constant $d$ there exists a family of $d$-regular graphs for which it is UGC-hard to
determine whether there exists a vertex cover of size $t$, or all vertex-covers have
size at least $(2 - O(\log\log(d)/\log(d)) - \epsilon)\,t$. Fixing $d$ to be a large enough constant
such that this factor is $> 1.5$, we now reduce this family of instances to a password
problem using the exact same construction as in the proof of Theorem 13, with the
exception that we set $k = g + e - (1.5 - \epsilon)t$.

Observe, for this family of instances, $e = O(g)$ so $|\mathcal{P}| = O(g)$, but also the size of
the optimal vertex-cover has to be $\Theta(g)$ (at most $g$ and at least $g/d$). Furthermore,
each password appears at the top of at most $d$ preference-lists. Therefore, by
allowing $\mathcal{A}$ and banning $\mathcal{B} = \mathcal{P} \setminus \mathcal{A}$, we not only have a distribution whose
support is of size $|L_{\mathcal{B}}|$, but it also holds that the probability of each word in $L_{\mathcal{B}}$ is
$\Theta(1/|L_{\mathcal{B}}|)$.

Therefore, if the graph has a vertex-cover $C$ of size $t$, then by banning all
words $\mathcal{B} = \{w_u : u \in C\}$ we have that the $n$ preference-lists induce a distribution
over $|L_{\mathcal{B}}| \ge g + e - t$ words. Since we set $k = g + e - (1.5 - \epsilon)t$ we have that the set of
most uncommon passwords contains at least $(0.5 - \epsilon)t = \Omega(|L_{\mathcal{B}}|)$ words, each with
$\Omega(1/|L_{\mathcal{B}}|)$ probability, thus $p(k, \mathcal{A}) = 1 - \Omega(1)$. (And, in particular, for the optimal
policy $\mathcal{A}^*$ we have $p(k, \mathcal{A}^*) = 1 - \Omega(1)$.)

In contrast, applying the same argument from the proof of Theorem 13, we
have that if $G$ has all vertex-covers of size $\ge (1.5 - \epsilon)t$ then $p(k, \mathcal{A}) = 1$. The
$O(1)$-hardness of approximation follows. □


5.3.5 Negative Rules: Hardness of Approximation for k = 1

We next turn to negative rules, where we show that the problem is extremely
difficult even for $k = 1$.


Theorem 15. Let $\epsilon > 0$. Unless P = NP there is no polynomial time algorithm (in
$N, n, m$) that approximates $\min_{S \subseteq [m]} p(1, \mathcal{A}_S)$ to a factor of $n^{1/3 - \epsilon}$ in the negative rules
setting and the rankings model.


Proof of Theorem 15. Fix $\epsilon > 0$. Our reduction is from the Max-Independent-Set
problem, which is known to be hard to approximate up to a factor of $n^{1-\epsilon}$ [90]. We
are given a graph $G$ with $g$ vertices and $e$ edges, and we must determine whether
the size of $G$'s largest independent set is $g^{1-\epsilon}$ or $g^{\epsilon}$.

Given a Max-Independent-Set instance, we denote $K = g^{\epsilon}$ and create the following
password policy instance, which is composed of the following set of
possible words:

    $\mathcal{P} = \{A_1, \ldots, A_K\} \cup \{B_1, \ldots, B_g\}$
        $\cup \bigcup_{(u,v) \in E(G)} \left( \{C^u_{v,1}, \ldots, C^u_{v,g}\} \cup \{C^v_{u,1}, \ldots, C^v_{u,g}\} \right)$
        $\cup \bigcup_{v \in V(G),\, 1 \le i < j \le K} \left( \{D_{v,i,j,1}, \ldots, D_{v,i,j,g}\} \cup \{D_{v,j,i,1}, \ldots, D_{v,j,i,g}\} \right)$
        $\cup \{X\}$

We now describe the $n = g + ge + g^2\binom{K}{2} \le g^3 + g^{1+2\epsilon}$ users' preference-lists.
We start with the $g$ rankings specified in Table 5.2a. We continue with $ge$ more
rankings, where for each edge $(u,v) \in E(G)$ we add $g$ more rankings, as detailed in
Table 5.2b. Lastly, we add $g^2\binom{K}{2}$ more rankings, where for each triple $(v, i, j)$ where
$v$ is a vertex of $G$ and $i \neq j \in [K]$ we add $g$ rankings, as detailed in Table 5.2c.
(Observe, the tables detail the first few words in each list, then end with a "..."
mark, which indicates that from that point on the remaining words may appear in
any order.)

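Finally, we detail our rules. For every $i \in [K]$ and $u \in V(G)$ we have a rule
which roughly corresponds to deciding that $u$ is a member of the independent set:

    $R_{u,i} = \{A_i\} \cup \bigcup_{v : (u,v) \in E(G)} \{C^u_{v,1}, \ldots, C^u_{v,g}\} \cup \bigcup_{j \neq i} \{D_{u,i,j,1}, \ldots, D_{u,i,j,g}\}$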

Our  analysis  now  follows  from  a  series  of  observations. 

Observation 1: If we do not ban all of the passwords $A_1, \dots, A_K$ then $p_1 \ge g/n$.
Therefore, for every $i$, we must choose at least one of the rules $\{R_{u,i}\}_{u \in V(G)}$ to
activate, or else we have that $p_1 \ge g/n$.

Observation 2: If we ban $C^{u}_{v,1}, \dots, C^{u}_{v,g}$ and $C^{v}_{u,1}, \dots, C^{v}_{u,g}$ then we must have
$p_1 \ge g/n$. Therefore, for any $i \neq j$ it must not be the case that we activate both
$R_{u,i}$ and $R_{v,j}$ where $(u, v) \in E(G)$, or else we have that $p_1 \ge g/n$.

Observation 3: If we ban $D_{v,i,j,1}, \dots, D_{v,i,j,g}$ and $D_{v,j,i,1}, \dots, D_{v,j,i,g}$ then $p_1 \ge g/n$.
Therefore, for any $i \neq j$ it must not be the case that we activate both $R_{u,i}$ and
$R_{u,j}$, or else we have that $p_1 \ge g/n$.

These observations lead us to the following conclusion. If $G$ contains an independent
set $v_1, \dots, v_K$ of size $K$, then activating the rules $\{R_{v_1,1}, R_{v_2,2}, \dots, R_{v_K,K}\}$ leads
to a setting where each truncated ranking begins with a unique word, so $p_1 = 1/n$.
In contrast, if $G$ does not have an independent set of size $K$, then $p_1 = g/n$. Since
$n = O(g^3)$ we have an $\Omega(n^{1/3})$-hardness of approximation. Observe also that the
number of total words is $N = K + g + 2eg + g^2K(K-1) + 1 = O(g^3) = O(n)$ so it is
also hard to approximate the problem to a factor of $\Omega(N^{1/3})$. □

Table 5.2: Rankings used in the proof of Theorem 15. Within each type, ranking
$t \in \{1, \dots, g\}$ lists the words shown, most preferred first.

| Type | Ranking $t$ |
|---|---|
| (a) Type 1 | $A_1, A_2, \dots, A_K, B_t, \dots$ |
| (b) Type 2, edge $(u, v)$ | $C^{u}_{v,t}, C^{v}_{u,t}, X, \dots$ |
| (c) Type 3, triple $(v, i, j)$ | $D_{v,i,j,t}, D_{v,j,i,t}, X, \dots$ |


5.4  Normalization  Model:  Complexity  Results 

In this section we focus on complexity results for the normalization model. Here
the structure of the input to our problem is a bit different: for each password
$w \in \mathcal{P}$ we are given the probability $\Pr[w]$ that $w$ is selected by a random user when
$\mathcal{A} = \mathcal{P}$. Note that now we can give the distribution explicitly because it requires
$N$ numbers (whereas a distribution over rankings requires $N!$ numbers). This
distribution induces a distribution over $\mathcal{P}$ for any password composition policy $\mathcal{A}$
by normalizing probabilities, as explained in Section 5.2.
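
The normalization step can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the helper name `induced_distribution` and
the toy probabilities are made up for this example and do not come from the thesis
or from any dataset.

```python
from typing import Dict, Set


def induced_distribution(base: Dict[str, float], allowed: Set[str]) -> Dict[str, float]:
    """Restrict `base` to the allowed passwords and rescale so it sums to 1.

    Hypothetical helper: `base` plays the role of Pr[w] over the full password
    space, `allowed` plays the role of the policy.
    """
    mass = sum(p for w, p in base.items() if w in allowed)
    if mass <= 0:
        raise ValueError("policy allows no password with positive probability")
    return {w: p / mass for w, p in base.items() if w in allowed}


# Toy distribution over a four-word password space (made-up numbers).
base = {"123456": 0.5, "password": 0.25, "iloveyou": 0.125, "Tr0ub4dor&3": 0.125}
policy = {"password", "iloveyou", "Tr0ub4dor&3"}  # ban only the top password
induced = induced_distribution(base, policy)
```

In this toy instance, banning the most likely password rescales the remaining
probabilities, so "password" rises from 0.25 to 0.5 under the induced distribution.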

Because  the  normalization  model  is  a  special  case  of  the  ranking  model  our 
algorithms  for  the  ranking  model  can  also  be  applied  in  the  normalization  model. 
The  question  is  whether  or  not  the  hardness  results  carry  over. 

We first consider the singleton rules setting with large $k$, and show that
we can compute $\arg\min_{\mathcal{A} \subseteq \mathcal{P}} p(k, \mathcal{A})$ in polynomial time in $N$ (Theorem 16). This
result separates the normalization model from the ranking model (e.g., compare
Theorems 16 and 13). However, it does not extend to the positive rules setting. In
fact, we show that optimizing $p(k, \mathcal{A}_S)$ is NP-Hard when $k$ is a parameter (Theorem
18).

With negative rules $R_1, \dots, R_m$ we show that it is hard to $c_0$-approximate
$\min_{S \subseteq [m]} p(1, \mathcal{A}_S)$ (Theorem 17). However, we cannot rule out the possibility
of an efficient $c$-approximation algorithm for some constant $c$ in the normalization
model (recall that Theorem 15 ruled out the possibility of a $c$-approximation
algorithm in the ranking model for any constant $c$).


5.4.1  Singleton  Rules:  Efficient  Algorithm  for  large  k 

We present SortAndOptimize, an efficient algorithm to optimize $p(k, \mathcal{A})$ in the
singleton rules setting for any value of $k$. The key intuition behind our algorithm
is that if $w_1 \in \mathcal{P}$ is the most likely password then $w_1$ will remain the most likely
allowed password unless we ban it, a property that does not hold in the rankings
model. A formal proof of Theorem 16 can be found in Appendix 9.1.

Theorem 16. For every $k$, Algorithm 5.4 computes $\arg\min_{\mathcal{A}} p(k, \mathcal{A})$ in the singleton
rules setting of the normalized probabilities model, in time $O(N \log N)$.


Algorithm 5.4 SortAndOptimize

Input:
    Password space $\mathcal{P}$ and a probability distribution over $\mathcal{P}$.
    Integer $k$.

Sort the words in $\mathcal{P}$ from highest to lowest probability, $w_1, w_2, \dots, w_N$.
return the set $\mathcal{A}_i = \{w_j : j \ge i\}$, where $i$ minimizes the ratio

$$p(k, \mathcal{A}_i) = \frac{\sum_{i \le j < i+k} \Pr[w_j]}{\sum_{j \ge i} \Pr[w_j]}$$
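
A minimal Python sketch of SortAndOptimize follows. It is an illustration rather
than the authors' implementation: the function name, the dictionary-based input
format, and the toy numbers are assumptions made for this example. The sketch sorts
the words once and uses suffix sums so that each candidate cut-off $i$ is evaluated
in constant time, in line with the $O(N \log N)$ bound of Theorem 16.

```python
from typing import Dict, List, Tuple


def sort_and_optimize(probs: Dict[str, float], k: int) -> Tuple[List[str], float]:
    """Sketch of SortAndOptimize: return (allowed words, p(k, policy)).

    `probs` maps each password to its probability in the unrestricted space.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Sort from most to least likely: w_1, w_2, ..., w_N (0-indexed here).
    words = sorted(probs, key=probs.get, reverse=True)
    n = len(words)
    # suffix[i] = Pr[words[i]] + ... + Pr[words[n-1]], with suffix[n] = 0.
    suffix = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + probs[words[i]]
    best_i, best_ratio = None, float("inf")
    for i in range(n):
        if suffix[i] <= 0:
            break  # no probability mass left to normalize by
        top_k = suffix[i] - suffix[min(i + k, n)]  # mass of the k likeliest allowed words
        ratio = top_k / suffix[i]
        if ratio < best_ratio:
            best_i, best_ratio = i, ratio
    if best_i is None:
        raise ValueError("distribution has no positive probability mass")
    return words[best_i:], best_ratio


# Toy distribution (made-up numbers).
toy = {"a": 0.4, "b": 0.3, "c": 0.1, "d": 0.1, "e": 0.1}
allowed, ratio = sort_and_optimize(toy, k=1)
```

With $k = 1$ the sketch bans "a" and "b", leaving three equally likely words, so
the single best guess succeeds with probability about 1/3.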


5.4.2 Negative Rules: Hardness for k = 1

We  next  prove  an  inapproximability  result  that  is  somewhat  weaker  than  the  one 
that  we  obtained  for  the  more  general  ranking  model. 

Theorem 17. There exists some constant $c_0 > 1$ such that unless NP = BPP no polynomial
time algorithm (in $n, N, m$) can $c_0$-approximate $\min_{S \subseteq [m]} p(1, \mathcal{A}_S)$ in the negative
rules setting and the normalization model.

We  will  require  the  following  construction;  the  proof  is  given  in  Appendix  9.1. 

Lemma 4. Fix $m$ and $s$ such that $m > s$. There exists a domain $D$ of size $O(s^2 \log(m))$ and
a family of $m$ sets, $F_1, F_2, \dots, F_m \subseteq D$, such that each set in the family contains $|D|/s$
elements, and for every $C \subseteq [m]$ of size $|C| \le s$, we have that the size of the union is
$\left|\bigcup_{i \in C} F_i\right| = \Omega(|C|) \cdot |D|/s$.

This domain can be constructed in randomized poly$(s, m)$ time.

That is, each set in this family contains exactly the same fraction of the domain,
and furthermore any union of $|C| \le s$ sets has the property that its cardinality is
proportional to $\Omega(|C|)\,|F_i|$.

Proof of Theorem 17. We reduce from Set-Cover, one of the classic NP-Complete
problems [94]. We are given sets $S_1, \dots, S_m \subseteq U$, universe $U = \{1, \dots, g\}$, and an
integer $t < m$, and we are asked whether there is a set $C \subseteq [m]$ of size $\le t$ such that

$$U = \bigcup_{i \in C} S_i.$$

It is a known fact that there exist Set-Cover instances, with $(g, m, t)$ all polynomially
dependent on each other, that are hard to approximate to a factor of $c \ln n$ [17].
That is, on this particular family of instances, it is NP-hard to distinguish whether
there exists a cover of size $t$ or all covers have size $(1 - \epsilon) c \cdot t \ln n$.

We now describe the reduction. Given a $(g, m, t)$-Set-Cover instance, we set
$s = c \cdot t \ln g = \Theta(t \ln t)$ and construct a domain $D$ and $m$ sets $F_1, \dots, F_m \subseteq D$ as in
Lemma 4. We then create the following password-banning instance. First, $\mathcal{P}$ is the
union of $D$ with $g$ additional disjoint words denoted $w_1, \dots, w_g$. Now, for each set $S_i$
in the Set-Cover instance we add a rule $R_i$, where $R_i = \{w_j\}_{j \in S_i} \cup F_i$. Finally, we set
the words' probabilities as follows. Fixing some arbitrarily small $\delta > 0$, we set for every $i$
the probability $\Pr[w_i] = \frac{1-\delta}{g}$ and for every $x \in D$ we set the probability $\Pr[x] = \frac{\delta}{|D|}$.

Without loss of generality we can assume that $|D| \ge 100g$ (because, for example,
we can take $100g$ copies of the original $D$). Therefore, any policy that bans all of
$\{w_1, w_2, \dots, w_g\}$ yet leaves a constant (say $\ge 1/10$) fraction of $D$ has $p_1 \le 10/|D|$,
whereas any policy that keeps even one of the words in $\{w_1, w_2, \dots, w_g\}$ has
$p_1 \ge 1/(2g)$. Therefore, if the Set-Cover instance has a cover of size $\le s = O(t \ln g)$,
then a $c_0$-approximation of the optimal banning-policy must find a cover for
$\{w_1, w_2, \dots, w_g\}$. We will assume from now on that our Set-Cover instance is such
that it has a cover of size $\le s$. (Indeed, if $s > t \log(t)$ then the instance is no longer
NP-hard, since the greedy algorithm must return a cover of size $> t \log(t)$, which
causes us to deduce that the optimal cover must have size $> t$.)

So now, suppose our Set-Cover instance has a cover of size $t$. Then the respective
union of rules bans every password in $\{w_1, w_2, \dots, w_g\}$ and no more than
$\frac{t}{s}|D|$ words of $D$ (we get an upper bound by multiplying the size of each set by
the number of sets). This leaves a collection of $\left(1 - \frac{t}{s}\right)|D|$ equally likely words,
so $p_1 = \left(1 - \frac{t}{s}\right)^{-1}|D|^{-1} = (1 - O(1/\log(g)))^{-1}|D|^{-1} = (1 + o(1))|D|^{-1}$. In contrast, if
all covers of our Set-Cover instance have size $s' \ge c \cdot t \ln(g)$ (where, because we
assume some cover has size $\le s$, we have $s' \le s$), then any collection of rules that
bans all words in $\{w_1, w_2, \dots, w_g\}$ must also ban at least an $\Omega(1)$ fraction of the
words in $D$. This leaves at most $(1 - \Omega(1))|D|$ words in $D$ and so
$p_1 \ge (1 - \Omega(1))^{-1}|D|^{-1}$. Denoting the constant $(1 - \Omega(1))^{-1}$ by $c_0$, we have that any
$c_0 - \epsilon$ approximation of the optimal banning-policy indicates the existence of a
cover of cardinality $< c \cdot t \ln(g)$. □

5.4.3 Positive Rules: Hardness of Approximation for Large k

While we can show that it is possible to optimize $p(k, \mathcal{A})$ in the singleton rules
setting, our result does not extend to the more general positive rules setting. We
are able to show that it is NP-Hard to compute $\arg\min_{S \subseteq [m]} p(k, \mathcal{A}_S)$. However,
our reduction does not imply approximation hardness so we cannot rule out the
existence of a PTAS.

Theorem 18. Unless P = NP there is no polynomial time algorithm (in $N, m, n$) which
outputs $\arg\min_{S \subseteq [m]} p(k, \mathcal{A}_S)$ in the positive rules setting and the normalization model.

The  theorem's  proof  is  relegated  to  Appendix  9.1. 


5.5  Efficient  Sampling  Algorithms 

In  a  sense,  our  complexity  results  are  not  "realistic",  and  in  particular  in  the 
ranking  model  our  positive  algorithmic  results  assume  access  to  each  user's  full 
preferences.  Moreover,  some  algorithms  are  allowed  to  run  in  polynomial  time 
in  the  number  of  passwords  N,  which  can  be  huge.  In  this  section  we  use  our 
complexity  results  as  guidelines  in  the  design  of  practical  sampling  algorithms. 

In more detail, we are given oracle access to rules $R_1, \dots, R_m$ (e.g., we can ask
whether or not a password $w \in R_i$) and we are allowed to sample from the
distribution induced by the password composition policy $\mathcal{A}_S$ for any $S \subseteq [m]$.
Less formally, a sample is equivalent to asking a random user what her favorite
password is given the current policy.

We  will  work  in  the  more  general  ranking  model,  so  there  is  essentially  only 
one  positive  result  we  can  build  on:  Theorem  12,  a  polynomial  time  algorithm 
for  constant  k  in  the  positive  rules  setting.  When  adapting  this  algorithm  to 
the  sampling  setting,  we  cannot  expect  it  to  work  perfectly  due  to  the  inherent 
uncertainty of this domain. Instead we expect the algorithm to find an $\epsilon$-optimal
password composition policy with probability at least $1 - \delta$, for any given $\epsilon$ and $\delta$.
Crucially,  the  number  of  samples  must  not  depend  on  the  number  of  passwords 
N,  and  must  have  a  polynomial  dependence  on  the  other  parameters. 

Formally, we let $S^* \subseteq [m]$ denote the optimal collection of positive rules to activate
(for all $S \subseteq [m]$, $p(1, \mathcal{A}_{S^*}) \le p(1, \mathcal{A}_S)$). Our goal is to find a $(1, \epsilon)$-approximation
$S \subseteq [m]$ to $p(1, \mathcal{A}_{S^*})$, that is, $S$ such that $p(1, \mathcal{A}_S) \le p(1, \mathcal{A}_{S^*}) + \epsilon$ with probability
$1 - \delta$.

We first present Algorithm 5.5 that achieves our goal for $k = 1$; this algorithm
is an adaptation of Algorithm 5.2.

Algorithm 5.5 SampleAndEliminate

Positive Rules: $R_1, \dots, R_m$
Input: $\epsilon$, $\delta$

Initialize: $S_0 \leftarrow [m]$, $i \leftarrow 0$
while $S_i \neq \emptyset$ do
    Sample: Draw samples $w_1, \dots, w_s$ according to the distribution $\Pr[w \mid \mathcal{A}_{S_i}]$
    $W \leftarrow \{w_1, \dots, w_s\}$
    $s_w \leftarrow |\{j : w_j = w\}|$ for each $w \in W$
    $w^* \leftarrow \arg\max \{s_w \mid w \in W\}$    ▷ $w^*$ is the most frequently sampled password
    $p_i \leftarrow s_{w^*}/s$    ▷ $p_i$ is our estimate of $\Pr[w^* \mid \mathcal{A}_{S_i}]$
    if $p_i < \epsilon/2$ then return $S_i$    ▷ The current solution is already sufficiently good
    else
        $S_{i+1} \leftarrow S_i - \{j \mid w^* \in R_j\}$    ▷ Deactivate all rules that contain $w^*$
        $i \leftarrow i + 1$
return $S_{i^*}$ where $i^* = \arg\min_{j < i} p_j$.
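
The loop of SampleAndEliminate can be sketched in Python as follows. This is a
sketch under stated assumptions, not the authors' code: the sampling oracle is
simulated in the normalization model by the hypothetical helper `make_sampler`, the
sample size `s` is passed in directly rather than derived from $\epsilon$ and $\delta$,
and the toy passwords, probabilities, and rules are made up for illustration.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Set

Rule = Callable[[str], bool]
Sampler = Callable[[Set[int], int], List[str]]


def make_sampler(base: Dict[str, float], rules: List[Rule], rng: random.Random) -> Sampler:
    """Simulated user population in the normalization model (assumption of this sketch)."""
    def draw(active: Set[int], s: int) -> List[str]:
        # Positive rules: a password is allowed if any active rule accepts it.
        allowed = [w for w in base if any(rules[j](w) for j in active)]
        if not allowed:
            raise ValueError("the active rules allow no password")
        return rng.choices(allowed, weights=[base[w] for w in allowed], k=s)
    return draw


def sample_and_eliminate(rules: List[Rule], draw: Sampler, s: int, eps: float) -> Set[int]:
    if not rules:
        raise ValueError("need at least one rule")
    active = set(range(len(rules)))
    history = []  # (estimate p_i, rule set S_i) for every iteration
    while active:
        counts = Counter(draw(active, s))
        w_star, count = counts.most_common(1)[0]  # most frequently sampled password
        p_i = count / s                           # estimate of Pr[w* | current policy]
        history.append((p_i, set(active)))
        if p_i < eps / 2:
            return active                         # already sufficiently good
        # Deactivate every rule that contains w*.
        active = {j for j in active if not rules[j](w_star)}
    # All rules were deactivated: return the iterate with the smallest estimate.
    return min(history, key=lambda item: item[0])[1]


# Toy instance (made-up numbers and rules).
base = {"123456": 0.6, "password1": 0.3, "Pa55word!": 0.05, "correct horse": 0.05}
rules = [
    lambda w: any(c.isdigit() for c in w),      # rule 0: contains a digit
    lambda w: len(w) >= 9,                      # rule 1: 9 characters or more
    lambda w: any(not c.isalnum() for c in w),  # rule 2: contains a symbol
]
chosen = sample_and_eliminate(rules, make_sampler(base, rules, random.Random(0)), s=2000, eps=0.2)
```

On this toy instance the loop first deactivates the digit rule (because of
"123456"), then the length rule (because of "password1"), and finally settles on
the policy that keeps only the symbol rule, whose two allowed passwords are equally
likely.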

Theorem 19. Algorithm 5.5 runs in polynomial time in $m$, $1/\epsilon$, $1/\delta$, requires
$O(m \log(m/\delta)/\epsilon^2)$ samples and returns a $(1, \epsilon)$-approximation $S \subseteq \{1, \dots, m\}$ of
$p(1, \mathcal{A}_{S^*})$ with probability at least $1 - \delta$.

Proof. Let

$$\mathrm{BAD}_i = \left\{ \exists w \in \mathcal{A}_{S_i} : \left| \frac{s_w}{s} - \Pr[w \mid \mathcal{A}_{S_i}] \right| \ge \epsilon/2 \right\}$$

denote the event that our probability estimates are off during iteration $i$. Claim 5
bounds the probability of any bad event. The proof of Claim 5 can be found in the
appendix. The proof involves bucketing the passwords based on their probability,
applying Chernoff Bounds to upper bound the probability of a bad estimate for
our passwords in each bucket, and repeatedly applying union bounds.

Claim 5. $\Pr[\exists i, \mathrm{BAD}_i] \le \delta$.

For the rest of the analysis we assume that no bad event occurs. Let
$p^* = \min_{S \subseteq [m]} p(1, \mathcal{A}_S)$ and suppose that $\mathcal{A}_{S^*} \subseteq \mathcal{A}_{S_i}$. Clearly, this is true when $i = 0$. If
$p_i \ge \epsilon/2 + p^*$ then $\Pr[w^* \mid \mathcal{A}_{S^*}] \ge \Pr[w^* \mid \mathcal{A}_{S_i}] > p^*$ so that $w^* \notin \mathcal{A}_{S^*}$. Hence,
$\mathcal{A}_{S^*} \subseteq \mathcal{A}_{S_{i+1}}$ and the property is maintained for at least one more iteration. If instead
$p_i < \epsilon/2 + p^*$ then we have $p_{i^*} \le p_i \le p^* + \epsilon/2$ so for each $w \in \mathcal{A}_{S_{i^*}}$ we have
$\Pr[w \mid \mathcal{A}_{S_{i^*}}] \le p^* + \epsilon$.

We conclude that the solution $S_{i^*}$ is a $(1, \epsilon)$-approximation. □


We next explain how to extend Algorithm 5.3 to $(1, \epsilon)$-approximate the optimal
$p(k, \mathcal{A}_S)$ for any constant $k$.

Theorem 20. There is an algorithm which runs in polynomial time (in $m$, $1/\epsilon$, $1/\delta$), takes
a polynomial number of samples, and returns a $(1, \epsilon)$-approximation $S \subseteq [m]$ of $p(k, \mathcal{A}_{S^*})$
with probability at least $1 - \delta$.


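Proof sketch. To extend Algorithm 5.3 to $(1, \epsilon)$-approximate $p(k, \mathcal{A}_S)$ for constant $k$ we
need one more idea. We cannot simply obtain a reduced password space $\hat{\mathcal{P}}$ by
reducing preference lists because we can only sample from our distribution. Notice
that for any $S \subseteq [m]$ such that $i \in S$ we have $\Pr[w \mid \mathcal{A}_S] \le \Pr[w \mid \mathcal{A}_{\{i\}}]$, so to obtain
a $(1, \epsilon)$-approximation it is sufficient to limit our attention to passwords in the
following set:

$$\hat{\mathcal{P}} = \left\{ w \;\middle|\; \exists i,\ \Pr[w \mid \mathcal{A}_{\{i\}}] \ge \frac{\epsilon}{k} \right\}$$

We can obtain a superset of $\hat{\mathcal{P}}$ by sampling. For each positive rule $R_i$ we draw $s$
independent samples from the distribution $\Pr[w \mid \mathcal{A}_{\{i\}}]$ and set

$$T_i = \left\{ w \;\middle|\; \frac{s_w}{s} \ge \frac{\epsilon}{2k} \right\},$$

where $s_w$ is the number of samples equal to $w$.

Intuitively, a password $w$ is included in $T_i$ if and only if our estimated probability
is sufficiently large. Let $T = \bigcup_i T_i$. For a sufficiently large sample size
$s = O(\mathrm{poly}(m, k, 1/\epsilon, 1/\delta))$ we can apply Chernoff Bounds to argue that with probability
$1 - \delta$ (1) $|T|$ is small, i.e., $O(\mathrm{poly}(m, k, 1/\epsilon, 1/\delta))$, and (2) $T \supseteq \hat{\mathcal{P}}$. □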
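
The construction of the candidate set $T$ can be sketched as follows. The code is a
hedged illustration: the oracle `draw(i, s)`, the function name `candidate_set`, and
the cutoff $\epsilon/(2k)$ on the empirical frequency are assumptions of this sketch
(the cutoff is our reading of the threshold used to define $T_i$), and the
deterministic `fake_draw` stands in for real sampling so the example is reproducible.

```python
from collections import Counter
from typing import Callable, List, Set


def candidate_set(draw: Callable[[int, int], List[str]], m: int, s: int,
                  eps: float, k: int) -> Set[str]:
    """Union of the per-rule sets T_i of frequently sampled passwords.

    `draw(i, s)` is a hypothetical sampling oracle returning s passwords
    drawn under the policy that activates only positive rule i.
    """
    threshold = eps / (2 * k)  # assumed cutoff on the empirical frequency
    T: Set[str] = set()
    for i in range(m):
        counts = Counter(draw(i, s))
        T |= {w for w, c in counts.items() if c / s >= threshold}
    return T


def fake_draw(i: int, s: int) -> List[str]:
    """Deterministic stand-in for the oracle: one common word and two rare ones."""
    return [f"pw{i}"] * (s - 2) + ["rare-a", "rare-b"]


T = candidate_set(fake_draw, m=3, s=100, eps=0.2, k=2)
```

Because the empirical frequencies within one sample sum to 1, each $T_i$ built this
way contains at most $2k/\epsilon$ passwords, so $|T| \le 2mk/\epsilon$ regardless of the
size $N$ of the password space.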


5.6  Experiments 

To  demonstrate  how  our  ideas  could  apply  in  a  real-world  scenario,  we  simulated 
runs  of  Algorithm  5.5  by  sampling  with  replacement  from  the  RockYou  leaked 
password  set  [92].  The  set  contains  over  32  million  passwords  with  a  frequency 
distribution  similar  to  that  of  many  other  password  sets  [39].  Note  that  all  results 
presented  here  are  limited  by  the  dataset  and  assume  the  normalization  model. 
Working in the normalization model is crucial because we cannot ask the RockYou
users for their preferred password under a specific policy; an initial distribution
over $\mathcal{P}$, which is available to us, is sufficient though, because it induces a
distribution for any policy $\mathcal{A}$.

We selected 21 positive rules that mirror password composition rules commonly
used in practice, and looked at sample sizes $s$ of 100, 500,
1000, 5000, and 10000. The rules included length requirements, character class
requirements, combinations of requirements, a dictionary check, etc. (See Table
5.3 in Section 5.6.1 for a complete listing of the rules we selected.) For each run
with a particular value of $s$, the algorithm returns a policy $\mathcal{A}_S$ for which we can
measure $p(1, \mathcal{A}_S)$ in the original dataset and compare with the optimal $p(1, \mathcal{A}_{S^*})$,
determined from running Algorithm 5.2 on the original dataset. We performed
500 runs for each of the five values of $s$.

To  gain  an  understanding  of  how  policies  based  on  negative  rules  perform, 
we  took  the  complement  of  the  21  positive  rules  selected  above  to  get  21  nega¬ 
tive  rules.  We  then  determined  the  optimal  negative  rules  policy  by  calculating 
$S^* = \arg\min_{S \subseteq [m]} p(1, \mathcal{A}_S)$ via brute-force. This was required because we have
no  equivalent  to  Algorithm  5.2  for  negative  rules.  With  this  baseline  in  hand, 
we  designed  two  naive  algorithms,  similar  in  spirit  to  Algorithm  5.5.  There  are 
multiple  ways  to  discard  a  password  in  the  negative  rules  setting,  and  one  algo¬ 
rithm  makes  this  decision  randomly  while  the  other  bans  the  smallest  subset  as 
determined  from  the  current  sample.  Again,  500  runs  were  performed  for  each 
$s \in \{100, 500, 1000, 10000, 50000\}$.
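
A brute-force search over negative-rule subsets can be sketched as below. The
sketch assumes the normalization model, enumerates all $2^m$ subsets (so it is only
feasible for a small number of rules), and uses hypothetical names (`p_k`,
`best_negative_policy`) and a made-up toy distribution; it is not the code used in
the experiments.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

Rule = Callable[[str], bool]


def p_k(base: Dict[str, float], allowed: Sequence[str], k: int = 1) -> float:
    """Mass of the k most likely allowed passwords after normalization."""
    probs = sorted((base[w] for w in allowed), reverse=True)
    total = sum(probs)
    if total <= 0:
        return float("inf")  # a policy that allows nothing is treated as infeasible
    return sum(probs[:k]) / total


def best_negative_policy(base: Dict[str, float], rules: List[Rule],
                         k: int = 1) -> Tuple[float, Tuple[int, ...]]:
    """Try every subset of negative rules; a password is banned if any active rule matches."""
    best: Tuple[float, Tuple[int, ...]] = (float("inf"), ())
    m = len(rules)
    for size in range(m + 1):
        for subset in combinations(range(m), size):
            allowed = [w for w in base if not any(rules[j](w) for j in subset)]
            score = p_k(base, allowed, k)
            if score < best[0]:
                best = (score, subset)
    return best


# Toy instance (made-up numbers).
base = {"123456": 0.4, "password": 0.3, "Passw0rd": 0.1, "qwerty12": 0.1, "Tr0ub4dor": 0.1}
rules = [
    lambda w: len(w) < 8,                       # rule 0: less than 8 characters
    lambda w: not any(c.isdigit() for c in w),  # rule 1: less than 1 digit
    lambda w: not any(c.isupper() for c in w),  # rule 2: less than 1 uppercase
]
score, subset = best_negative_policy(base, rules)
```

In this toy instance the best choice activates rules 0 and 1, which bans the two
most popular passwords and leaves three equally likely ones.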

5.6.1  Experiment  Rules 

We  selected  rules  based  on  common  types  of  rules  used  in  constructing  password 
composition  policies,  e.g.,  the  policies  recommended  by  NIST  [50].  The  rules  we 
selected  are  shown  in  Table  5.3.  Positive  and  negative  forms  of  each  rule  are  shown. 
In  the  positive  rules  setting,  a  password  is  allowed  if  it  matches  any  positive  rule. 
In  the  negative  rules  setting,  a  password  is  banned  if  it  matches  any  negative  rule. 
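
The two semantics can be made concrete with a short Python sketch. The three rules
below are taken from Table 5.3, but the predicate implementations and the helper
names (`negate`, `allowed_positive`, `allowed_negative`) are assumptions made for
illustration.

```python
from typing import Callable, Dict, Iterable

Rule = Callable[[str], bool]

POSITIVE_RULES: Dict[str, Rule] = {
    "8 characters or more": lambda w: len(w) >= 8,
    "1 digit or more": lambda w: sum(c.isdigit() for c in w) >= 1,
    "1 uppercase or more": lambda w: sum(c.isupper() for c in w) >= 1,
}


def negate(rule: Rule) -> Rule:
    """Negative form of a positive rule: matches exactly the passwords it rejects."""
    return lambda w: not rule(w)


NEGATIVE_RULES: Dict[str, Rule] = {
    "not: " + name: negate(rule) for name, rule in POSITIVE_RULES.items()
}


def allowed_positive(w: str, active: Iterable[Rule]) -> bool:
    """Positive setting: allowed if the password matches ANY active rule."""
    return any(rule(w) for rule in active)


def allowed_negative(w: str, active: Iterable[Rule]) -> bool:
    """Negative setting: banned if the password matches ANY active rule."""
    return not any(rule(w) for rule in active)
```

For example, "password" is allowed when all three positive rules are active (it has
8 characters), but banned when all three negative rules are active (it has no digit).
A positive policy is therefore a union of rule sets, while a negative policy built
from the complements is an intersection of the corresponding positive rules.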

The  dictionary  check  used  the  cracking  dictionary  from  openwall.com.  This 
dictionary  is  used  by  one  of  the  most  well-known  password  crackers,  John  the 
Ripper  [63].  Since  this  dictionary  contains  all  alphabetic  strings  up  to  size  3,  it 
was  pruned  to  only  include  entries  of  4  characters  or  more  for  the  "contains  a 
dictionary  word"  dictionary  check. 

Notice that for some groups of rules, e.g., length rules, digit rules, etc., the
subsets defined by these rules are subsets or supersets of each other. For example,
if the positive rule "8 characters or more" is in a policy, adding the "10 characters
or more" rule yields the same policy. We did this to prevent the selection of overly
complex policies, e.g., "8 characters" OR "11 characters" OR "12 characters" OR
"14 characters." However, we also selected a couple of "combination rules" to
make policies more interesting.


5.6.2  Baselines 

We  examined  several  baselines  for  comparison  with  our  algorithm.  Table  5.4  shows 
these  baselines,  the  probability  of  the  most  frequent  password  in  the  resulting 
policy,  and  the  optimal  policy  as  a  union  or  intersection  of  rules  (for  clarity,  the 
complement  of  the  union  of  negative  rules  is  shown  as  the  intersection  of  positive 
rules). 

As  shown  in  Table  5.4  from  the  means  across  policies,  randomly  selecting  a 
policy  from  the  power  set  of  rules  can  be  worse  than  having  no  policy.  The  "one 
rule  maximum"  baseline  was  selected  because,  if  decided  based  on  sampling, 
only  m  distributions  need  be  sampled.  Our  efficient  algorithm  requires  the  same 
amount of sampling, but can find the optimal policy over $S \subseteq [m]$ rather than
$S \in \{1, \dots, m\}$. Also of interest is the optimal policy with negative rules, which is
over  3x  better  than  the  optimal  policy  with  positive  rules.  However,  as  shown  in 
the  following  section,  the  performance  of  our  sampling  algorithms  with  negative 
rules  was  far  worse  than  in  the  positive  rules  setting. 


5.6.3  Performance 

In  the  positive  rules  setting  (see  Table  5.5),  the  algorithm  performed  extremely 
well  even  at  moderate  sample  sizes.  The  average  policy  selected  with  s  =  500  was 
almost 10x better than having no policy. At $s = 1000$, the optimal policy was found
10%  of  the  time  (50  out  of  500  times). 

In  the  negative  rules  setting  (see  Table  5.6),  however,  neither  algorithm  found 
the  optimal  policy.  The  "Ban  Smallest"  heuristic,  when  faced  with  a  choice  between 
multiple  subsets  that  contain  the  most  likely  password,  decides  to  ban  the  smallest 
available  subset,  disrupting  the  space  the  least.  This  might  seem  like  an  intuitively 
good  choice  but,  in  fact,  it  fails  to  find  a  better  policy  than  the  empty  set  at 
large sample sizes. The randomized algorithm does better (it cannot actually do
worse) but still has much worse average case performance than using our efficient
algorithm with positive rules.


5.7  Discussion 

We  conclude  this  chapter  by  discussing  some  key  points. 

Usability. In this chapter our goal was to optimize the security of a password
composition policy. However, many users find it difficult to comply with all of
the requirements of a complicated password composition policy. Can we quantify
the usability costs of a password composition policy? Can we characterize the
trade-off between security and usability in password composition policies? Can
we find the optimal password composition policy subject to usability constraints?

Where do the rules come from? Throughout the paper we have assumed that
the rules (whether positive or negative) are given as part of the input; it is not up
to us to find these rules. Our experiments indicate that a collection of intuitive
and practical rules can already give very good results on real data. However, the
question of deciding which rules should be added to our collection is outside the
scope of this paper. Much like the problem of feature selection, it is an interesting
problem with real-life implications, which we suspect will be very difficult in
practice.

Alternate policy goals. Our goal [45] has been to minimize $p(k, \mathcal{A}_S)$. Intuitively,
$p(k, \mathcal{A}_S)$ represents the probability that an adversary with no background knowledge
can successfully guess the password of a randomly selected user in $k$ tries. A
small value of $k$ optimizes security guarantees against an online guessing attack in
which the adversary is locked out after $k$ failed attempts to log in. A much larger
value of $k$ (e.g., $2^{32}$) is necessary to optimize security against an adversary who has
obtained the cryptographic hash of a password and is able to mount a brute-force
dictionary attack [136]. However, the optimal solutions for $p(1, \mathcal{A}_S)$ and $p(2^{32}, \mathcal{A}_S)$
might be completely different. One stronger goal that we might hope to achieve is
to optimize both goals simultaneously. More formally, can we find a policy $S \subseteq [m]$
such that for every $S' \subseteq [m]$ and every $k \le N$ we have $p(k, \mathcal{A}_S) \le c \cdot p(k, \mathcal{A}_{S'})$ for
some constant $c$? Unfortunately, the answer is no. For any constant $c$ this universal
approximation goal is impossible to satisfy in the ranking model; see Theorem
32 in the appendix.

Other natural goals include $\alpha$-work factor [121] and a refinement called $\alpha$-guesswork
[39] (e.g., maximize the total number of guesses needed to compromise
a fraction $\alpha$ of the accounts). While $\alpha$-guesswork is a useful metric to analyze the
security of 70 million Yahoo passwords [39], it may not be a desirable optimization
goal for the organization because it might allow the adversary to crack up to
an $(\alpha - \epsilon)$ fraction of the accounts with relatively few guesses.

Another  interesting  direction  is  to  account  for  an  adversary  with  basic  back¬ 
ground  information  about  the  user  (e.g.,  e-mail  address,  username,  birthday).  It 
may  not  always  be  realistic  to  assume  that  the  adversary  has  no  background  knowl¬ 
edge  because  the  adversary  can  often  easily  obtain  some  background  knowledge 
about  a  user  by  searching  for  publicly  available  information  on  the  internet.  One 
approach  might  be  to  design  a  rule  R  to  specify  different  passwords  for  different 
users  (e.g.,  the  set  of  passwords  that  contain  the  username  or  birthday  of  the  user). 


Open  Questions.  While  we  were  able  to  prove  several  hardness  results  about 
finding  the  optimal  password  composition  policy  in  the  negative  rules  setting, 
it  is  possible  that  these  hardness  results  could  be  circumvented  by  making  mild 
(hopefully  realistic)  assumptions  about  the  underlying  password  distribution  or 
the rules $R_1, \dots, R_m$. Are there efficient algorithms to optimize $p(k, \mathcal{A}_S)$ in the negative
rules setting given realistic assumptions? It is also possible that mild realistic
assumptions  could  be  used  to  circumvent  the  impossibility  result  of  Theorem  32, 
and  design  a  universal  approximation  algorithm. 

There  are  also  several  interesting  technical  questions  that  remain  open: 

1. Normalization model with negative rules: Can we efficiently $c$-approximate
$p(1, \mathcal{A}_{S^*})$ for any constant $c$? Is there a sub-exponential algorithm (in $m$) to
compute $p(1, \mathcal{A}_{S^*})$?

2. Ranking model with positive rules: Can we efficiently $c$-approximate $p(k, \mathcal{A}_{S^*})$
for some constant $c$ when $k$ is a parameter?

The  future.  There  is  a  real  need  for  a  principled  approach  to  optimizing  password 
composition  policies.  We  have  taken  a  first  step  in  this  direction  by  providing  an 
intuitive  theoretical  model  and  showing  that  it  leads  to  algorithms  that  perform 
well  on  real  data.  We  can  only  hope  that  our  work  will  spark  a  fundamentally 
new interaction between theory and practice in passwords research.

| Positive Rule | Negative Rule | Details |
|---|---|---|
| 8 characters or more | Less than 8 characters | Length rules |
| 9 characters or more | Less than 9 characters | Length rules |
| 10 characters or more | Less than 10 characters | Length rules |
| 11 characters or more | Less than 11 characters | Length rules |
| 12 characters or more | Less than 12 characters | Length rules |
| 13 characters or more | Less than 13 characters | Length rules |
| 14 characters or more | Less than 14 characters | Length rules |
| 15 characters or more | Less than 15 characters | Length rules |
| 16 characters or more | Less than 16 characters | Length rules |
| 1 digit or more | Less than 1 digit | Character class rules |
| 1 symbol or more | Less than 1 symbol | Character class rules |
| 1 lowercase or more | Less than 1 lowercase | Character class rules |
| 1 uppercase or more | Less than 1 uppercase | Character class rules |
| 2 digits or more | Less than 2 digits | Character class rules |
| 2 symbols or more | Less than 2 symbols | Character class rules |
| 2 lowercase or more | Less than 2 lowercase | Character class rules |
| 2 uppercase or more | Less than 2 uppercase | Character class rules |
| In a dictionary | Not in a dictionary | Dictionary checks |
| Contains a dictionary word | Does not contain a dictionary word | Dictionary checks |
| 8 characters or more AND 1 uppercase or more | Less than 8 characters OR less than 1 uppercase | Combination rules |
| 8 characters or more AND 1 uppercase or more AND 1 digit or more | Less than 8 characters OR less than 1 uppercase OR less than 1 digit | Combination rules |

Table 5.3: Rules Used in Sampling Experiments

| Baseline | $p(1, \mathcal{A}_S)$ | $S$ |
|---|---|---|
| Mean across negative rules policies | $1.3 \times 10^{-2}$ | |
| Mean across positive rules policies | $1.0 \times 10^{-2}$ | |
| All passwords allowed (no policy) | $9.2 \times 10^{-3}$ | |
| One positive rule ($S \in \{1, \dots, m\}$) | $6.8 \times 10^{-4}$ | 8 chars, 1 upper, 1 digit |
| Optimal policy with positive rules | $4.4 \times 10^{-4}$ | 14 chars OR 2 symbols OR 8 chars, 1 upper, 1 digit |
| Optimal policy with negative rules | $1.4 \times 10^{-4}$ | 10 chars AND 2 digits AND 1 symbol AND 1 lowercase AND not in dictionary |

Table 5.4: Baseline probabilities for the RockYou dataset


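| Sample Size | mean $p(1, \mathcal{A}_S)$ | min $p(1, \mathcal{A}_S)$ | % Optimal |
|---|---|---|---|
| 100 | $6.8 \times 10^{-3}$ | $1.2 \times 10^{-3}$ | |
| 500 | $9.7 \times 10^{-4}$ | $4.4 \times 10^{-4}$ | 2% |
| 1000 | $9.5 \times 10^{-4}$ | $4.4 \times 10^{-4}$ | 10% |
| 5000 | $6.0 \times 10^{-4}$ | $4.4 \times 10^{-4}$ | 14% |
| 10000 | $5.7 \times 10^{-4}$ | $4.4 \times 10^{-4}$ | 19% |

Table 5.5: Performance of Sampling Algorithms with Positive Rules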


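| Sample Size | Random Decision: mean $p(1, \mathcal{A}_S)$ | Random Decision: min $p(1, \mathcal{A}_S)$ | Ban Smallest: mean $p(1, \mathcal{A}_S)$ | Ban Smallest: min $p(1, \mathcal{A}_S)$ |
|---|---|---|---|---|
| 100 | $6.8 \times 10^{-3}$ | $1.2 \times 10^{-3}$ | $7.2 \times 10^{-3}$ | $2.3 \times 10^{-3}$ |
| 500 | $4.4 \times 10^{-3}$ | $6.3 \times 10^{-4}$ | $9.0 \times 10^{-3}$ | $2.3 \times 10^{-3}$ |
| 1000 | $4.3 \times 10^{-3}$ | $4.5 \times 10^{-4}$ | $8.6 \times 10^{-3}$ | $2.3 \times 10^{-3}$ |
| 5000 | $6.3 \times 10^{-3}$ | $4.5 \times 10^{-4}$ | $9.2 \times 10^{-3}$ | $9.2 \times 10^{-3}$ |
| 10000 | $7.2 \times 10^{-3}$ | $4.5 \times 10^{-4}$ | $9.2 \times 10^{-3}$ | $9.2 \times 10^{-3}$ |

Table 5.6: Performance of Sampling Algorithms with Negative Rules

Chapter 6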

GOTCHAs:  A  Defense  Against 
Offline  Attacks 

6.1  Introduction 

Any  adversary  who  has  obtained  the  cryptographic  hash  of  a  user's  password  can 
mount  an  automated  brute-force  attack  to  crack  the  password  by  comparing  the 
cryptographic  hash  of  the  user's  password  with  the  cryptographic  hashes  of  likely 
password  guesses.  This  attack  is  called  an  offline  dictionary  attack,  and  there 
are  many  password  crackers  that  an  adversary  could  use  [63].  Offline  dictionary 
attacks against passwords are, unfortunately, powerful and commonplace
[87]. Adversaries have been able to compromise servers at large companies (e.g.,
Zappos, LinkedIn, Sony, Gawker [5, 9, 10, 11, 13, 28]), resulting in the release of
millions of cryptographic password hashes¹. It has been repeatedly demonstrated
that users tend to select easily guessable passwords [39, 66, 92], and password
crackers are able to quickly break many of these passwords [136]. Offline attacks are
becoming increasingly dangerous as computing hardware improves (a modern
GPU can evaluate a cryptographic hash function like SHA2 about 250 million times
per second [165]) and as more and more training data, i.e., leaked passwords from
prior breaches, becomes available [87]. Symantec reported that compromised
passwords have significant economic value to an adversary (e.g., compromised
passwords are sold on the black market for between $4 and $30) [79].

^1 In a few of these cases [5, 10] the passwords were stored in the clear.

HOSPs (Human-Only Solvable Puzzles) were suggested by Canetti, Halevi
and Steiner as a way of defending against offline dictionary attacks [51]. The basic
idea  is  to  change  the  authentication  protocol  so  that  human  interaction  is  required 
to  verify  a  password  guess.  The  authentication  protocol  begins  with  the  user 
entering  his  password.  In  response  the  server  randomly  generates  a  challenge  — 
using  the  password  as  a  source  of  randomness  —  for  the  user  to  solve.  Finally, 
the  server  appends  the  user's  response  to  the  user's  password,  and  verifies  that 
the  hash  matches  the  record  on  the  server.  To  crack  the  user's  password  offline 
the  adversary  must  simultaneously  guess  the  user's  password  and  the  answer  to 
the  corresponding  puzzle.  The  challenge  should  be  easy  for  a  human  to  solve 
consistently  so  that  a  legitimate  user  can  authenticate.  To  mitigate  the  threat  of  an 
offline  dictionary  attack  the  HOSP  should  be  difficult  for  a  computer  to  solve  — 
even  if  it  has  all  of  the  random  bits  used  to  generate  the  challenge. 
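
The protocol described above can be sketched in a few lines. This is a minimal illustration rather than the construction from [51]: SHA-256, the salt, and the helper names `challenge_seed`, `record` and `verify` are assumptions made for the example, and the human's answer to the puzzle is simply passed in as a string.

```python
import hashlib
import hmac
import os


def challenge_seed(password: str, salt: bytes) -> bytes:
    """Derive the random bits used to generate the puzzle from the password."""
    return hashlib.sha256(b"challenge|" + salt + password.encode()).digest()


def record(password: str, response: str, salt: bytes) -> bytes:
    """Hash of the password with the user's puzzle response appended."""
    data = salt + password.encode() + b"|" + response.encode()
    return hashlib.sha256(data).digest()


def verify(stored: bytes, password: str, response: str, salt: bytes) -> bool:
    """Accept only if both the password and the puzzle response are right."""
    return hmac.compare_digest(stored, record(password, response, salt))


# Registration: the server keeps only the salt and the combined hash.
salt = os.urandom(16)
stored = record("correct horse", "evil clown", salt)
```

An offline attacker who steals `salt` and `stored` must supply a puzzle response for every password guess, which is the step that is intended to require a human.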

The basic HOSP construction proposed by Canetti et al. [51] was to fill a hard
drive  with  regular  CAPTCHAs  (e.g.,  distorted  text)  by  storing  the  puzzles  without 
the  answers.  This  solution  only  provides  limited  protection  against  an  adversary 
because  the  number  of  unique  puzzles  that  can  be  generated  is  bounded  by  the  size 
of  the  hard  drive  (e.g.,  the  adversary  could  pay  people  to  solve  all  of  the  puzzles 
on  the  hard  drive).  See  Appendix  10.2  for  more  discussion.  Finding  a  usable 
HOSP  construction  which  does  not  rely  on  a  very  large  dataset  of  pregenerated 
CAPTCHAs  is  an  open  problem.  Several  candidate  HOSPs  were  experimentally 
tested  [59]  (they  are  called  POSHs  in  the  second  paper),  but  the  usability  results 
were  underwhelming. 


Contributions  In  this  chapter  we  introduce  a  simple  modification  of  HOSPs 
that  we  call  GOTCHAs  (Generating  panOptic  Turing  Tests  to  Tell  Computers  and 
Humans  Apart).  We  use  the  adjective  Panoptic  to  refer  to  a  world  without  privacy 
—  there  are  no  hidden  random  inputs  to  the  puzzle  generation  protocol.  The 
basic  goal  of  GOTCHAs  is  similar  to  the  goal  of  HOSPs  —  defending  against 
offline dictionary attacks. GOTCHAs differ from HOSPs in two ways: (1) Unlike
a HOSP, a GOTCHA may require human interaction during the generation of the
challenge. (2) We relax the requirement that a user needs to be able to answer all
challenges  easily  and  consistently.  If  the  user  can  remember  his  password  during 
the  authentication  protocol  then  he  will  only  ever  see  one  challenge.  We  only 
require  that  the  user  must  be  able  to  answer  this  challenge  consistently.  If  the  user 
enters  the  wrong  password  during  authentication  then  he  may  see  new  challenges. 
We  do  not  require  that  the  user  must  be  able  to  solve  these  challenges  consistently 
because authentication will fail in either case. We do require that it is difficult
for a computer to distinguish between the "correct" challenge and an "incorrect"
challenge. Our main theorem demonstrates that GOTCHAs, like HOSPs, can be
used to defend against offline dictionary attacks. The goal of these relaxations is
to enable the design of usable GOTCHAs.

Figure 6.1: Randomly Generated Inkblot Image (an evil clown?)

We introduce a candidate GOTCHA construction based on Inkblot images.
While the images are generated randomly by a computer, the human mind can
easily imagine semantically meaningful objects in each image. To generate a
challenge the computer first generates ten inkblot images (e.g., figure 6.1). The
user then provides labels for each image (e.g., evil clown, big frog). During
authentication the challenge is to match each inkblot image with the corresponding
label. We empirically evaluate the usability of our inkblot matching GOTCHA
construction by conducting a user study on Amazon's Mechanical Turk. Finally,
we challenge the AI community to break our GOTCHA construction.

Organization  The rest of this chapter is organized as follows: We next discuss
related  work  in  section  6.1.1.  We  formally  define  GOTCHAs  in  section  6.2  and 
formalize  the  properties  that  a  GOTCHA  should  satisfy.  We  present  our  candidate 
GOTCHA  construction  in  section  6.3,  and  in  section  6.3.1  we  demonstrate  how 
our  GOTCHA  could  be  integrated  into  an  authentication  protocol.  We  present  the 
results  from  our  user  study  in  section  6.3.2,  and  in  section  6.3.3  we  challenge  the  AI 
and  security  communities  to  break  our  GOTCHA  construction.  In  section  6.4  we 
prove  that  GOTCHAs  like  HOSPs  can  also  be  used  to  design  a  password  storage 
system  which  mitigates  the  threat  of  offline  attacks.  We  conclude  by  discussing 
future  directions  and  challenges  in  section  6.5. 


6.1.1  Related  Work 

Inkblots [148] have been proposed as an alternative way to generate and remember
passwords. Stubblefield and Simon proposed showing the user ten randomly
generated inkblot images, and having the user make up a word or a phrase to
describe each image. These phrases were then used to build a 20 character password
(e.g., users were instructed to take the first and last letter of each phrase).
Usability results were moderately good, but users sometimes had trouble remembering
their association. Because the Inkblots are publicly available there is also a security
concern that Inkblot passwords could be guessable if different users consistently
picked similar phrases to describe the same Inkblot.

We  stress  that  our  use  of  Inkblot  images  is  different  in  two  ways:  (1)  Usability: 
We  do  not  require  users  to  recall  the  word  or  phrase  associated  with  each  Inkblot. 
Instead we require users to recognize the word or phrase associated with each
Inkblot  so  that  they  can  match  each  phrase  with  the  appropriate  Inkblot  image. 
Recognition  is  widely  accepted  to  be  easier  than  the  task  of  recall  [23,  155].  (2) 
Security:  We  do  not  need  to  assume  that  it  would  be  difficult  for  other  humans 
to  match  the  phrases  with  each  Inkblot.  We  only  assume  that  it  is  difficult  for  a 
computer  to  perform  this  matching  automatically. 

CAPTCHAs, formally introduced by Von Ahn et al. [152], have gained
widespread adoption on the internet to prevent bots from automatically registering
for accounts. A CAPTCHA is a program that generates a puzzle, which should
be easy for a human to solve and difficult for a computer to solve, as well as a
solution. Many popular forms of CAPTCHAs (e.g., reCAPTCHA [153]) generate
garbled text, which is easy^2 for a human to read, but difficult for a computer to
decipher. Other versions of CAPTCHAs rely on the natural human capacity for
audio [131] or image recognition [67].

CAPTCHAs have been used to defend against online password guessing attacks:
users are sometimes required to solve a CAPTCHA before signing into
their account. An alternative approach is to lock out a user after several incorrect
guesses, but this can lead to denial of service attacks [60]. However, if the
adversary has access to the cryptographic hash of the user's password, then he can
circumvent all of these requirements and execute an automatic dictionary attack
to crack the password offline. By contrast, HOSPs were proposed by Canetti et al. [51]
to defend against offline attacks. HOSPs are in some ways similar
to CAPTCHAs (Completely Automated Public Turing tests to tell Computers and
Humans Apart) [152]. CAPTCHAs are widely used on the internet to fight spam
by preventing bots from automatically registering for accounts. In this setting
a CAPTCHA is sent to the user as a challenge, while the secret solution is used
to grade the user's answer. The implicit assumption is that the answer and the
random bits used to generate the puzzle remain hidden; otherwise a spam bot
could simply regenerate the puzzle and the answer. While this assumption may
be reasonable in the spam bot setting, it does not hold in our offline password
attack setting in which the server has already been breached. A HOSP is different
from a CAPTCHA in several key ways: (1) The challenge must remain difficult
for a computer to solve even if the random bits used to generate the puzzle are
made public. (2) There is no single correct answer to a HOSP. It is okay if different
people give different responses to a challenge as long as people can respond to the
challenges easily, and each user can consistently answer the challenges.

The  only  HOSP  construction  proposed  in  [51]  involved  stuffing  a  hard  drive 
with  unsolved  CAPTCHAs.  The  problem  of  finding  a  HOSP  construction  that  does 
not  rely  on  a  dataset  of  unsolved  CAPTCHAs  was  left  as  an  open  problem  [51]. 
Several  other  candidate  HOSP  constructions  have  been  experimentally  evaluated 
in  subsequent  work  [59]  (they  are  called  POSHs  in  the  second  paper),  but  the 
usability results for every scheme that did not rely on a large dataset of unsolved
CAPTCHAs  were  underwhelming. 

^2 Admittedly some people would dispute the use of the label 'easy.'

GOTCHAs are very similar to HOSPs. The basic application, defending
against offline dictionary attacks, is the same, as are the key tools: exploiting
the power of interaction during authentication and exploiting hard artificial
intelligence problems. While the authentication with HOSPs is interactive, the initial
generation of the puzzle is not. By contrast, our GOTCHA construction requires
human interaction during the initial generation of the puzzle. This simple
relaxation allows for the construction of new solutions. In the HOSP paper humans
are simply modeled as a puzzle solving oracle, and the adversary is assumed to
have a limited number of queries to a human oracle. We introduce a more intricate
model of the human agent with the goal of designing more usable constructions.


Password Storage  Password storage is an incredibly challenging problem.
Adversaries have been able to compromise servers at many large companies (e.g.,
Zappos, LinkedIn, Sony, Gawker [5, 9, 10, 11, 13, 28]). For example, hackers were
able to obtain 32 million plaintext passwords from RockYou using a simple SQL
injection attack [5]. While it is considered an extremely poor security practice to store
passwords in the clear [141], the practice is still fairly common [5, 10, 41]. Many
other companies [13, 41] have used cryptographic hashes to store their passwords,
but failed to adopt the practice of salting (e.g., instead of storing the cryptographic
hash of the password $\mathbf{H}(pw)$ the server stores $(\mathbf{H}(pw, r), r)$ for a random string $r$
[16]) to defend against rainbow table attacks. Rainbow tables, which consist of
precomputed hashes, are often used by an adversary to significantly speed up a
password cracking attack because the same table can be reused to attack each user
when the passwords are unsalted [117].
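
A small sketch may help show why salting defeats a precomputed table. It uses SHA-256 from Python's `hashlib` purely for illustration; the dictionary, the table and the function names are assumptions, and a real rainbow table uses hash chains rather than the plain lookup table built here.

```python
import hashlib
import os

dictionary = ["123456", "password", "iloveyou"]

# Precomputed once, reusable against every unsalted hash.
table = {hashlib.sha256(pw.encode()).digest(): pw for pw in dictionary}


def store_unsalted(pw: str) -> bytes:
    return hashlib.sha256(pw.encode()).digest()


def store_salted(pw: str):
    r = os.urandom(16)  # fresh random string r for each user
    return hashlib.sha256(pw.encode() + r).digest(), r


unsalted = store_unsalted("password")
salted, r = store_salted("password")
```

Looking up `unsalted` in `table` recovers the password immediately, while `salted` is not in the table: the adversary has to recompute the hashes for each user's `r`.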

Cryptographic hash functions like SHA1, SHA2 and MD5, which were designed for fast
hardware computation, are popular choices for password hashing. Unfortunately,
this allows an adversary to try up to 250 million guesses per second on a
modern GPU [165]. The BCRYPT [122] hash function was designed specifically
with passwords in mind: BCRYPT was intentionally designed to be slow to
compute (e.g., to limit the power of an adversary's offline attack). The BCRYPT
hash  function  takes  a  parameter  which  allows  the  programmer  to  specify  how 
costly  the  hash  computation  should  be.  The  downside  to  this  approach  is  that  it 
also  increases  costs  for  the  company  that  stores  the  passwords  (e.g.,  if  we  want 
it  to  cost  the  adversary  $1,000  for  every  million  guesses  then  it  will  also  cost  the 
company  at  least  $1,000  for  every  million  login  attempts). 
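
The cost trade-off can be illustrated with a tunable hash. The sketch below uses PBKDF2 from Python's standard library as a stand-in for BCRYPT (it is a different function, chosen only because it needs no third-party package); the function name `slow_hash` and the iteration counts are arbitrary choices for the example.

```python
import hashlib
import os
import time


def slow_hash(pw: str, salt: bytes, iterations: int) -> bytes:
    """Iterated hash; `iterations` plays the role of the cost parameter."""
    return hashlib.pbkdf2_hmac("sha256", pw.encode(), salt, iterations)


salt = os.urandom(16)
for iterations in (1_000, 100_000):
    start = time.perf_counter()
    slow_hash("password", salt, iterations)
    elapsed = time.perf_counter() - start
    print(f"{iterations:>7} iterations: {elapsed:.4f} s per evaluation")
```

The server pays the same per-evaluation cost on every login attempt that the adversary pays on every guess, which is the downside noted above.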

Users are often advised (or required) to follow strict guidelines when selecting
their password (e.g., use a mix of upper/lower case letters, include numbers and
change the password frequently) [133]. However, empirical studies show that
users are often frustrated by restrictive policies and commonly forget their
passwords [34, 75, 102]^3. Furthermore, the cost of these restrictive policies can be
quite high. For example, a Gartner case study [158] estimated that it cost over $17
per password-reset call. Florencio and Herley [76] studied the economic factors
that institutions consider before adopting password policies and found that they
often value usability over security.


6.2  Definitions 

In  this  section  we  seek  to  establish  a  theoretical  basis  for  GOTCHAs.  Several 
of  the  ideas  behind  our  definitions  are  borrowed  from  theoretical  definitions  of 
CAPTCHAs  [152]  and  HOSPs  [51].  Like  CAPTCHAs  and  HOSPs,  GOTCHAs  are 
based  on  the  assumption  that  some  AI  problem  is  hard  for  a  computer  to  solve,  but 
easy  for  a  person  to  solve.  Ultimately,  these  assumptions  are  almost  certainly  false 
(e.g.,  because  the  human  brain  can  solve  a  GOTCHA  it  is  reasonable  to  believe 
that  there  exists  a  computer  program  to  solve  the  problems).  However,  it  may  still 
be  reasonable  to  assume  that  these  problems  cannot  be  solved  by  applying  known 
ideas.  By  providing  a  formal  definition  of  GOTCHAs  we  can  determine  whether 
or  not  a  new  idea  can  be  used  to  break  a  candidate  GOTCHA  construction. 

We use $c \in C$ to denote the space of challenges that might be generated. We
use $\mathcal{H}$ to denote the set of human users and $H(c, \sigma_t)$ to denote the response that
a human $H \in \mathcal{H}$ gives to the challenge $c \in C$ at time $t$. Here, $\sigma_t$ denotes the
state of the human's brain at time $t$. $\sigma_t$ is supposed to encode our user's existing
knowledge (e.g., vocabulary, experiences) as well as the user's mental state at time
$t$ (e.g., what is the user thinking about at time $t$). Because $\sigma_t$ changes over time (e.g.,
new experiences) we use $H(c) = \{H(c, \sigma_t) \mid t \in \mathbb{N}\}$ to denote the set of all answers
a human might give to a challenge $c$. We use $\mathcal{A}$ to denote the range of possible
responses (answers) that a human might give to the challenges.

Definition 12. Given a metric $d : \mathcal{A} \times \mathcal{A} \to \mathbb{R}$, we say that a human $H$ can consistently
solve a challenge $c \in C$ with accuracy $\alpha$ if $\forall t \in \mathbb{N}$

$$d\left(H(c, \sigma_0), H(c, \sigma_t)\right) \le \alpha,$$

where $\sigma_0$ denotes the state of the human's brain when he initially answers the challenge. If
$|H(c)| = 1$ then we simply say that the human can consistently solve the challenge.

^3 In fact the resulting passwords are sometimes more vulnerable to an offline attack! [34, 102]

Notation: When we have a group of challenges $(c_1, \ldots, c_k)$ we will sometimes
write $H((c_1, \ldots, c_k), \sigma_t) = (H(c_1, \sigma_t), \ldots, H(c_k, \sigma_t))$ for notational convenience. We
use $y \sim \mathcal{D}$ to denote a random sample from the distribution $\mathcal{D}$, and we use
$r \sim \{0,1\}^n$ to denote an element drawn from the set $\{0,1\}^n$ uniformly at random.
We stress that while $H$ denotes a human user in this chapter, $\mathbf{H}$ still denotes a
cryptographic hash that would be evaluated by a computer as in the rest of this
thesis.

One of the requirements of a HOSP puzzle system [51] is that the human $H$
must be able to consistently answer any challenge that is generated (e.g., $\forall c \in C$,
$H$ can consistently solve $c$). These requirements seem to rule out promising ideas
for HOSP constructions like Inkblots [59]. In this construction the challenge is a
randomly generated inkblot image $I$, and the response $H(I, \sigma_0)$ is a word or phrase
describing what the user initially sees in the inkblot image (e.g., evil clown, soldier,
big lady with a ponytail). User studies have shown that $H(I, \sigma_0)$ does not always
match $H(I, \sigma_t)$, the phrase describing what the user sees at time $t$ [59]. In a few
cases the errors may be correctable (e.g., capitalization, plural/singular form of a
word), but oftentimes the phrase was completely different, especially if a long
time passed in between trials^4. By contrast, our GOTCHA construction does not
require the user to remember the phrases associated with each Inkblot. Instead
we rely on a much weaker assumption: the user can consistently recognize his
solutions. We say that a human can recognize his solutions to a set of challenges if
he can consistently solve a matching challenge (definition 13) in which he is asked
to match each of his solutions with the corresponding challenge.

Definition 13. Given an integer $k$, and a permutation $\pi : [k] \to [k]$, a matching
challenge $\hat{c}_\pi = (\vec{c}, \vec{a}) \in C$ of size $k$ is given by a $k$-tuple of challenges $\vec{c} = (c_{\pi(1)}, \ldots, c_{\pi(k)}) \in C^k$
and solutions $\vec{a} = H((c_1, \ldots, c_k), \sigma_0)$. The response to a matching challenge is a
permutation $\pi' = H(\hat{c}_\pi, \sigma_t)$.

^4 We would add the requirement that the human must be able to consistently answer the
challenges without spending time memorizing and rehearsing his response to the challenge. Otherwise
we could just as easily force the user to remember a random string to append on to his password.

For permutations $\pi : [k] \to [k]$ we use the distance metric

$$d_k(\pi_1, \pi_2) = \left|\{\, i \mid \pi_1(i) \neq \pi_2(i) \wedge 1 \le i \le k \,\}\right|.$$

$d_k(\pi_1, \pi_2)$ simply counts the number of entries where the permutations don't
match. We say that a human can consistently recognize his solution to a matching
challenge $\hat{c}_\pi$ with accuracy $\alpha$ if $\forall t.\ d_k(H(\hat{c}_\pi, \sigma_t), \pi) \le \alpha$. We use $\{\pi' \mid d_k(\pi, \pi') \le \alpha\}$
to denote the set of permutations $\pi'$ that are $\alpha$-close to $\pi$.
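
The distance metric and the set of $\alpha$-close permutations are easy to compute for small $k$. The following sketch is only an illustration: the function names are ours, permutations are written as 0-indexed tuples, and the brute-force enumeration is feasible only for small $k$.

```python
from itertools import permutations
from typing import Sequence


def perm_distance(p1: Sequence[int], p2: Sequence[int]) -> int:
    """Number of positions i where the two permutations disagree."""
    if len(p1) != len(p2):
        raise ValueError("permutations must have the same size k")
    return sum(1 for a, b in zip(p1, p2) if a != b)


def alpha_close(pi: Sequence[int], alpha: int):
    """All permutations within distance alpha of pi (brute force)."""
    k = len(pi)
    return [p for p in permutations(range(k)) if perm_distance(p, pi) <= alpha]


pi = (2, 0, 3, 1)
response = (2, 0, 1, 3)  # the user swapped the last two labels
print(perm_distance(response, pi), len(alpha_close(pi, 2)))
```

Two distinct permutations always differ in at least two positions, so accuracy $\alpha = 1$ accepts exactly the same responses as $\alpha = 0$.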

The  puzzle  generation  process  for  a  GOTCHA  involves  interaction  between  the 
human  and  a  computer:  (1)  The  computer  generates  a  set  of  k  challenges.  (2)  The 
human  solves  these  challenges.  (3)  The  computer  uses  the  solutions  to  produce  a 
final challenge^5. Formally,

Definition 14. A puzzle-system is a pair $(G_1, G_2)$, where $G_1$ is a randomized challenge
generator that takes as input $1^k$ (with $k$ security parameter) and a pair of random bit
strings $r_1, r_2 \in \{0,1\}^*$ and outputs $k$ challenges $(c_1, \ldots, c_k) \leftarrow G_1(1^k, r_1, r_2)$. $G_2$ is a
randomized challenge generator that takes as input $1^k$ (security parameter), a random bit
string $r_1 \in \{0,1\}^*$, and proposed answers $\vec{a} = (a_1, \ldots, a_k)$ to the challenges $G_1(1^k, r_1, r_2)$ and
outputs a challenge $\hat{c} \leftarrow G_2(1^k, r_1, \vec{a})$. We say that the puzzle-system is $(\alpha, \beta)$-usable if

$$\Pr_{H \sim \mathcal{H}}\left[\mathrm{Accurate}(H, \hat{c}, \alpha)\right] \ge \beta,$$

whenever $\vec{a} = H(G_1(1^k, r_1, r_2), \sigma_0)$, where $\mathrm{Accurate}(H, \hat{c}, \alpha)$ denotes the event that the
human $H$ can consistently solve $\hat{c}$ with accuracy $\alpha$.

In our authentication setting the random string $r_1$ is extracted from the user's
password using a strong pseudorandom function Extract. To provide a concrete
example of a puzzle-system, $G_1$ could be a program that generates a set of inkblot
challenges $(I_1, \ldots, I_k)$ using random bits $r_1$, selects a random permutation $\pi : [k] \to [k]$
using random bits $r_2$, and returns $(I_{\pi(1)}, \ldots, I_{\pi(k)})$. The human's response to an
Inkblot, $H(I_j, \sigma_0)$, is whatever he/she imagines when he sees the inkblot $I_j$ for
the first time (e.g., some people might imagine an evil clown when they look at
figure 6.1). Finally, $G_2$ might generate Inkblots $\vec{c} = (I_1, \ldots, I_k)$ using random bits $r_1$,
and return the matching challenge $\hat{c}_\pi = (\vec{c}, \vec{a})$. In this case the matching challenge
is for the user to match his labels with the appropriate Inkblot images to recover
the permutation $\pi$. Observe that the final challenge, $\hat{c}_\pi$, can only be generated
after a round of interaction between the computer and a human. By contrast, the
challenges in a HOSP must be generated automatically by a computer. Also notice
that if $G_2$ is executed with a different random bit string $r_1'$ then we do not require
the resulting challenge to be consistently recognizable (e.g., if the user enters in
the wrong password then authentication will fail regardless of how he solves the
resulting challenge). For example, if the user enters the wrong password the
user might be asked to match his labels $(\ell_{\pi(1)}, \ldots, \ell_{\pi(k)}) = H((I_{\pi(1)}, \ldots, I_{\pi(k)}), \sigma_0)$ with
Inkblots $(I'_1, \ldots, I'_k)$ that he has never seen.

^5 We note that a HOSP puzzle system ($G$) [51] can be modeled as a GOTCHA puzzle system
$(G_1, G_2)$ where $G_1$ does nothing and $G_2$ simply runs $G$ to generate the final challenge $\hat{c}$ directly.

An  adversary  could  attack  a  puzzle  system  by  either  (1)  attempting  to  distin¬ 
guish  between  the  correct  puzzle,  and  puzzles  that  might  be  meaningless  to  the 
human,  or  (2)  by  solving  the  matching  challenge  directly. 


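We say that an algorithm $A$ can distinguish distributions $\mathcal{D}_1$ and $\mathcal{D}_2$ with
advantage $\epsilon$ if

$$\left| \Pr_{x \sim \mathcal{D}_1}\left[A(x) = 1\right] - \Pr_{y \sim \mathcal{D}_2}\left[A(y) = 1\right] \right| \ge \epsilon.$$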


Our  formal  definition  of  a  GOTCHA  is  found  in  definition  15.  Intuitively, 
definition  15  says  that  (1)  The  underlying  puzzle-system  should  be  usable  —  so 
that  legitimate  users  can  authenticate.  (2)  It  should  be  difficult  for  the  adversary  to 
distinguish  between  the  correct  matching  challenge  (e.g.,  the  one  that  the  user  will 
see  when  he  types  in  the  correct  password),  and  an  incorrect  matching  challenge 
(e.g.,  if  the  user  enters  the  wrong  password  he  will  be  asked  to  match  his  labels 
with  different  Inkblot  images),  and  (3)  It  should  be  difficult  for  the  adversary  to 
distinguish  between  the  user's  matching,  and  a  random  matching  drawn  from  a 
distribution  R  with  sufficiently  high  minimum  entropy. 
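
The distinguishing advantage that appears in conditions (2) and (3) can be estimated empirically for any concrete distinguisher. The sketch below is a generic Monte Carlo estimate, not part of the formal definition: the toy distributions, the threshold distinguisher `A` and the helper name `estimate_advantage` are all assumptions chosen for illustration.

```python
import random
from typing import Callable


def estimate_advantage(
    A: Callable[[float], int],
    sample_d1: Callable[[random.Random], float],
    sample_d2: Callable[[random.Random], float],
    trials: int = 20_000,
    seed: int = 0,
) -> float:
    """Estimate |Pr[A(x) = 1] - Pr[A(y) = 1]| for x ~ D1 and y ~ D2."""
    rng = random.Random(seed)
    ones_d1 = sum(A(sample_d1(rng)) for _ in range(trials))
    ones_d2 = sum(A(sample_d2(rng)) for _ in range(trials))
    return abs(ones_d1 - ones_d2) / trials


def A(x: float) -> int:
    """Toy distinguisher: output 1 when the sample exceeds 0.5."""
    return 1 if x > 0.5 else 0


# D1 is uniform on [0, 1); D2 is uniform on [0, 0.5).
adv = estimate_advantage(A, lambda rng: rng.random(), lambda rng: 0.5 * rng.random())
print(f"estimated advantage: {adv:.3f}")
```

For these toy distributions the true advantage of `A` is 1/2; a GOTCHA requires that no efficient distinguisher achieves more than $\epsilon$ (respectively $\delta$) on the distributions in definition 15.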

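Definition 15. A puzzle-system $(G_1, G_2)$ is an $(\alpha, \beta, \epsilon, \delta, \mu)$-GOTCHA if (1) $(G_1, G_2)$ is
$(\alpha, \beta)$-usable, (2) given a human $H \in \mathcal{H}$ no probabilistic polynomial time algorithm can
distinguish between distributions

$$\mathcal{D}_1 = \left\{ \Big( H(G_1(1^k, r_1, r_2), \sigma_0),\ G_2\big(1^k, r_1, H(G_1(1^k, r_1, r_2), \sigma_0)\big) \Big) \ \Big|\ r_1, r_2 \sim \{0,1\}^n \right\}$$

and

$$\mathcal{D}_2 = \left\{ \Big( H(G_1(1^k, r_1, r_2), \sigma_0),\ G_2\big(1^k, r_3, H(G_1(1^k, r_1, r_2), \sigma_0)\big) \Big) \ \Big|\ r_1, r_2, r_3 \sim \{0,1\}^n \right\}$$

with advantage greater than $\epsilon$, and (3) given a human $H \in \mathcal{H}$, there is a distribution $R(c)$
with $\mu(m)$ bits of minimum entropy such that no probabilistic polynomial time algorithm
can distinguish between distributions

$$\mathcal{D}_3 = \left\{ \Big( H(G_1(1^k, r_1, r_2), \sigma_0),\ G_2\big(1^k, r_1, H(G_1(1^k, r_1, r_2), \sigma_0)\big),\ H\big(G_2(1^k, r_1, H(G_1(1^k, r_1, r_2), \sigma_0)), \sigma_0\big) \Big) \ \Big|\ r_1, r_2 \sim \{0,1\}^n \right\}$$

and

$$\mathcal{D}_4 = \left\{ \Big( H(G_1(1^k, r_1, r_2), \sigma_0),\ G_2\big(1^k, r_1, H(G_1(1^k, r_1, r_2), \sigma_0)\big),\ R\big(G_2(1^m, r_1, (a_1, \ldots, a_m)), \sigma_0\big) \Big) \ \Big|\ r_1, r_2 \sim \{0,1\}^n \right\}$$

with advantage greater than $\delta$.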


6.2.1  Password  Storage  and  Offline  Attacks 


To protect users in the event of a server breach organizations are advised to store
salted password hashes, using a cryptographic hash function ($\mathbf{H} : \{0,1\}^* \to \{0,1\}^n$)
and a random bit string ($s \in \{0,1\}^*$) [133]. For example, if a user ($u$) chose
the password ($pw$) the server would store the tuple $(u, s, \mathbf{H}(s, pw))$. Any adversary
who has obtained $(u, s, \mathbf{H}(s, pw))$ (e.g., through a server breach) may mount a
fully automated offline dictionary attack using powerful password crackers
like John the Ripper [63]. To verify a guess $pw'$ the adversary simply computes
$\mathbf{H}(s, pw')$ and checks to see if this hash matches $\mathbf{H}(s, pw)$.

We assume that an adversary Adv who breaches the server can obtain the code
for $\mathbf{H}$, as well as the code for any GOTCHAs used in the authentication protocol.
Given the code for $\mathbf{H}$ and the salt value $s$ the adversary can construct a function

$$\mathrm{VerifyHash}(pw') = \begin{cases} 1 & \text{if } \mathbf{H}(s, pw) = \mathbf{H}(s, pw') \\ 0 & \text{otherwise.} \end{cases}$$
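
The construction of VerifyHash and its use in an offline dictionary attack can be sketched as follows. SHA-256 stands in for the cryptographic hash function, and the names `make_verify_hash` and `dictionary_attack` are hypothetical; the point is only that the adversary needs the salt and the stored hash, never the password itself.

```python
import hashlib
import os
from typing import Callable, Iterable, Optional


def H(s: bytes, pw: str) -> bytes:
    """Stand-in for the cryptographic hash of the salt and password."""
    return hashlib.sha256(s + pw.encode()).digest()


def make_verify_hash(s: bytes, stored: bytes) -> Callable[[str], int]:
    """Build VerifyHash from the stolen salt and stored hash value."""
    return lambda guess: 1 if H(s, guess) == stored else 0


def dictionary_attack(verify_hash: Callable[[str], int],
                      dictionary: Iterable[str]) -> Optional[str]:
    """Try each guess in turn; fully automated, no human involved."""
    for guess in dictionary:
        if verify_hash(guess) == 1:
            return guess
    return None


s = os.urandom(16)
stolen = H(s, "letmein")  # what a breach reveals, together with s
verify_hash = make_verify_hash(s, stolen)
print(dictionary_attack(verify_hash, ["123456", "password", "letmein"]))
```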


We also allow the adversary to have black box access to a GOTCHA solver (e.g.,
a human). We use $C_H$ to denote the cost of querying a human and $C_{\mathbf{H}}$ to denote
the cost of querying the function VerifyHash^6, and we use $n_H$ (resp. $n_{\mathbf{H}}$) to denote
the number of queries to the human (resp. VerifyHash). Queries to the human
GOTCHA solver are much more expensive than queries to the cryptographic
hash function ($C_H \gg C_{\mathbf{H}}$) [110]. For technical reasons we limit our analysis to
conservative adversaries.

Definition 16. We say that an adversary Adv is conservative if (1) Adv uses the
cryptographic hash function $\mathbf{H}$ in a black box manner (e.g., the hash function $\mathbf{H}$ and the
stored hash value are only used to construct a subroutine VerifyHash which is then used
as a black box by Adv), (2) the pseudorandom function Extract is used as a black box,
and (3) the adversary only queries a human about challenges generated using a password
guess.

^6 The value of $C_{\mathbf{H}}$ may vary widely depending on the particular cryptographic hash function:
it is inexpensive to evaluate SHA1, but BCRYPT [122] may be very expensive to evaluate.

It is reasonable to believe that our adversary is conservative. All existing
password crackers (e.g., [63]) use the hash function as a black box, and it is difficult
to imagine that the adversary would benefit by querying a human solver about
Inkblots that are unrelated to the password.

We use $D \subset \{0,1\}^*$ to denote a dictionary of likely guesses that the adversary
would like to try,

$$\mathrm{Cost}(\mathrm{Adv}, D) = n_{\mathbf{H}} C_{\mathbf{H}} + n_H C_H$$

to denote the cost of the queries that the adversary makes to check each guess in
$D$, and $\mathrm{Succeed}(\mathrm{Adv}, D, pw)$ to denote the event that the adversary makes a query
to VerifyHash that returns 1 (e.g., the adversary successfully finds the user's
password $pw$). The adversary might use a computer program to try to solve some
of the GOTCHAs to save cost by not querying a human. However, in this case
the adversary might fail to crack the password because the GOTCHA solver found
the wrong solution to one of the challenges.
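
As a worked example of the cost function, the sketch below compares a fully automated attack with one that queries a human for every guess. The prices are hypothetical and expressed in millionths of a dollar so that the arithmetic stays in integers; only the formula itself comes from the text.

```python
def attack_cost(n_hash: int, c_hash: int, n_human: int, c_human: int) -> int:
    """Cost(Adv, D): hash queries plus human queries, in one currency unit."""
    return n_hash * c_hash + n_human * c_human


# Hypothetical prices in millionths of a dollar.
C_HASH = 1        # one VerifyHash query
C_HUMAN = 1_000   # one query to a human solver
guesses = 1_000_000

automated = attack_cost(guesses, C_HASH, 0, C_HUMAN)
with_humans = attack_cost(guesses, C_HASH, guesses, C_HUMAN)
print(automated / 1_000_000, with_humans / 1_000_000)  # cost in dollars
```

With these assumed prices a million automated guesses cost one dollar, while a million guesses that each need a human query cost 1,001 dollars, so the human term dominates the budget.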

Definition 17. An adversary Adv is $(C, \gamma, D)$-successful if $\mathrm{Cost}(\mathrm{Adv}, D) \le C$, and

$$\Pr_{pw \sim D}\left[\mathrm{Succeed}(\mathrm{Adv}, D, pw)\right] \ge \gamma.$$


Our  attack  model  is  slightly  different  from  the  attack  model  in  [51].  They 
assume  that  the  adversary  may  ask  a  limited  number  of  queries  to  a  human 
challenge  solution  oracle.  Instead  we  adopt  an  economic  model  similar  to  [30], 
and  assume  that  the  adversary  is  instead  limited  by  a  budget  C,  which  may  be 
used to either evaluate the cryptographic hash function $\mathbf{H}$ or query a human $H$.


6.3  Inkblot  Construction 

Our candidate GOTCHA construction is based on Inkblot images. We use algorithm 6.1
to generate inkblot images. Algorithm 6.1 takes as input random bits $r_1$ and a
security parameter $k$, which specifies the number of Inkblots to output.
Algorithm 6.1 makes use of the randomized subroutine
DrawRandomEllipsePairs(I, t, width, height), which draws t pairs of ellipses on the
image I with the specified width and height. The first ellipse in each pair is drawn
at a random (x, y) coordinate on the left half of the image with a randomly selected
color and angle $a$ of rotation, and the second ellipse is mirrored on the right half
of the image. Figure 6.1 is an example of an Inkblot image generated by algorithm 6.1.

Algorithm 6.1 GenerateInkblotImages

Input: Security Parameter $1^k$, Random bit string $r_1 \in \{0,1\}^*$.
for $j = 1, \dots, k$ do
    $I_j$ ← new Blank Image    ▷ The following operations only use the random bit string $r_1$ as a source of randomness
    DrawRandomEllipsePairs($I_j$, 150, 60, 60)
    DrawRandomEllipsePairs($I_j$, 70, 20, 20)
    DrawRandomEllipsePairs($I_j$, 150, 60, 20)
return $(I_1, \dots, I_k)$    ▷ Inkblot Images
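The generation procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the released implementation: an image is represented as a plain list of ellipse records rather than rendered pixels, the canvas size (400 x 300) is hypothetical, and Python's `random.Random` seeded with $r_1$ stands in for "use $r_1$ as the only source of randomness".

```python
import random

CANVAS_W, CANVAS_H = 400, 300  # hypothetical canvas size, not from the text


def draw_random_ellipse_pairs(image, t, width, height, rng):
    """Append t mirrored ellipse pairs to `image` (a list of ellipse records)."""
    for _ in range(t):
        x = rng.uniform(0, CANVAS_W / 2)   # centre lies on the left half
        y = rng.uniform(0, CANVAS_H)
        color = tuple(rng.randrange(256) for _ in range(3))
        angle = rng.uniform(0, 180)
        image.append((x, y, width, height, angle, color))
        # Mirror across the vertical centre line: x -> W - x, angle -> -angle.
        image.append((CANVAS_W - x, y, width, height, -angle, color))


def generate_inkblot_images(k, r1):
    """Return k inkblot descriptions, determined entirely by the seed r1."""
    rng = random.Random(r1)
    images = []
    for _ in range(k):
        image = []
        draw_random_ellipse_pairs(image, 150, 60, 60, rng)
        draw_random_ellipse_pairs(image, 70, 20, 20, rng)
        draw_random_ellipse_pairs(image, 150, 60, 20, rng)
        images.append(image)
    return images
```

Because the seed fully determines the output, calling `generate_inkblot_images(k, r1)` twice with the same `r1` yields the same images, which is the property the matching challenge relies on.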

Our candidate GOTCHA is given by the pair $(G_1, G_2)$, defined in algorithms 6.2 and 6.3.
$G_1$ runs algorithm 6.1 to generate $k$ Inkblot images, and then returns these images
in permuted order, using a function GenerateRandomPermutation$(k, r)$, which
generates a random permutation $\pi : [k] \to [k]$ using random bits $r$. $G_2$ also runs
algorithm 6.1 to generate $k$ Inkblot images, and then outputs a matching challenge.


Algorithm 6.2 $G_1$

Input: Security Parameter $1^k$, Random bit strings $r_1, r_2 \in \{0,1\}^*$.
$(I_1, \dots, I_k)$ ← GenerateInkblotImages$(1^k, r_1)$
$\pi$ ← GenerateRandomPermutation$(k, r_2)$
return $(I_{\pi(1)}, \dots, I_{\pi(k)})$


After the Inkblots $(I_{\pi(1)}, \dots, I_{\pi(k)})$ have been generated, the human user is
queried to provide labels $\ell_{\pi(1)}, \dots, \ell_{\pi(k)}$ where

$$(\ell_{\pi(1)}, \dots, \ell_{\pi(k)}) = H\left((I_{\pi(1)}, \dots, I_{\pi(k)}), \sigma_0\right) .$$

In our authentication setting the server would store the labels $\ell_{\pi(1)}, \dots, \ell_{\pi(k)}$ in
permuted order. The final challenge, generated by algorithm 6.3, is to match
the Inkblot images $I_1, \dots, I_k$ with the user generated labels $\ell_1, \dots, \ell_k$ to recover the
permutation $\pi$.

Observation: Notice that if the random bits provided as input to
GenerateInkblotImages and GenerateMatchingChallenge match, then the user
will see the same Inkblot images in the final matching challenge. However, if the
random bits do not match (e.g., because the user typed the wrong password in our
authentication protocol) then the user will see different Inkblot images. The labels
$\ell_1, \dots, \ell_k$ will be the same in both cases.

Algorithm 6.3 GenerateMatchingChallenge $G_2$

Input: Security Parameter $1^k$, Random bits $r_1 \in \{0,1\}^*$ and labels $\vec{a} = (\ell_{\pi(1)}, \dots, \ell_{\pi(k)})$.
$(I_1, \dots, I_k)$ ← GenerateInkblotImages$(1^k, r_1)$
return $\hat{c}_\pi = ((I_1, \dots, I_k), \vec{a})$    ▷ Matching Challenge

6.3.1  GOTCHA  Authentication 

To  illustrate  how  our  GOTCHAs  can  be  used  to  defend  against  offline  attacks  we 
present  the  following  authentication  protocols:  Create  Account  (protocol  6.3.1) 
and  Authenticate  (protocol  6.3.2).  Communication  in  both  protocols  should  take 
place  over  a  secure  channel.  Both  protocols  involve  several  rounds  of  interaction 
between the user and the server. To create a new account the user sends his
username/password to the server, the server responds by generating $k$ Inkblot
images $I_1, \dots, I_k$, and the user provides a response $(\ell_1, \dots, \ell_k) = H((I_1, \dots, I_k), \sigma_0)$
based on his mental state at the time. The server stores these labels in permuted
order $\ell_{\pi(1)}, \dots, \ell_{\pi(k)}$.⁷ To authenticate later the user will have to match these labels
with the corresponding inkblot images to recover the permutation $\pi$.

In section 6.4 we argue that the adversary who wishes to mount a cost effective
offline attack needs to obtain constant feedback from a human. Following [51] we
assume that the function Extract $: \{0,1\}^* \to \{0,1\}^n$ is a strong randomness extractor,
which can be used to extract random strings from the user's password. Recall that
$H : \{0,1\}^* \to \{0,1\}^*$ denotes a cryptographic hash function.

Our  protocol  could  be  updated  to  allow  the  user  to  reject  challenges  he  found 
confusing  during  account  creation  in  protocol  6.3.1.  In  this  case  the  server  would 
simply  note  that  the  first  GOTCHA  was  confusing  and  generate  a  new  GOTCHA. 

⁷ For a general GOTCHA, protocol 6.3.1 would need to have an extra round of communication.
The server would send the user the final challenge generated by $G_2$ and the user would respond
with $H(G_2(\cdot, \cdot), \sigma_0)$. Protocol 6.3.1 takes advantage of the fact that $\pi = H(G_2(\cdot, \cdot), \sigma_0)$ is already
known.
Protocol 6.3.1: Create Account

Security Parameters: $k$, $n$.

(User): Select username ($u$) and password ($pw$) and send $(u, pw)$ to the server.
(Server): Sends Inkblots $(I_1, \dots, I_k)$ to the user where:
    $r' \sim \{0,1\}^n$, $r_1$ ← Extract$(pw, r')$, $r_2 \sim \{0,1\}^n$ and
    $(I_1, \dots, I_k)$ ← GenerateInkblotImages$(1^k, r_1)$
(User): Sends responses $(\ell_1, \dots, \ell_k)$ back to the server where:
    $(\ell_1, \dots, \ell_k)$ ← $H((I_1, \dots, I_k), \sigma_0)$.
(Server): Store the tuple $t$ where $t$ is computed as follows:
    Salt: $s \sim \{0,1\}^n$
    $\pi$ ← GenerateRandomPermutation$(k, r_2)$.
    $h_{pw}$ ← $H(u, s, pw, \pi(1), \dots, \pi(k))$
    $t$ ← $(u, r', s, h_{pw}, \ell_{\pi(1)}, \dots, \ell_{\pi(k)})$


Once  our  user  has  created  an  account  he  can  login  by  following  protocol  6.3.2. 

Claim  6  says  that  a  legitimate  user  can  successfully  authenticate  if  our  Inkblot 
construction  satisfies  the  usability  requirements  of  a  GOTCHA.  The  proof  of  claim 
6  can  be  found  in  appendix  10.1. 

Claim 6. If $(G_1, G_2)$ is a $(\alpha, \beta, \epsilon, \delta, \mu)$-GOTCHA then at least $\beta$-fraction of humans can
successfully authenticate using protocol 6.3.2 after creating an account using protocol
6.3.1.

One way to improve usability of our authentication protocol is to increase the
neighborhood of acceptably close matchings by increasing $\alpha$. The disadvantage is
that the running time for the server in protocol 6.3.2 increases with the size of $\alpha$.
Claim 7 bounds the time needed to enumerate over all close permutations. The
proof of claim 7 can be found in appendix 10.1.

Claim 7. For all permutations $\pi : [k] \to [k]$ and $\alpha \ge 0$

$$\left|\{\pi' \mid d_k(\pi, \pi') \le \alpha\}\right| \le 1 + \sum_{i=2}^{\alpha} \binom{k}{i}\, i! .$$


For example, if the user matches $k = 10$ Inkblots and we want to accept matchings
that are off by at most $\alpha = 5$ entries then the server would need to enumerate
over at most 36,091 permutations.⁸ Organizations are already advised to use
password hash functions like BCRYPT [122], which are intentionally designed to be
slower than standard cryptographic hash functions, often by a factor of millions.
Instead of making the hash function a million times slower to evaluate the server
might instead make the hash function a thousand times slower to evaluate and
use these extra computation cycles to enumerate over close permutations. The
organization's trade-off is between security, usability and the resources that it
needs to invest during the authentication process.

Protocol 6.3.2: Authenticate

Security Parameters: $k$, $n$.
Usability Parameter: $\alpha$

(User): Send username ($u$) and password ($pw'$); $pw'$ may or may not be correct.
(Server): Sends challenge $\hat{c}_\pi$ to the user where $\hat{c}_\pi$ is computed as follows:
    Find $t = (u, r', s, h_{pw}, \ell_{\pi(1)}, \dots, \ell_{\pi(k)})$
    $r'_1$ ← Extract$(pw', r')$
    $(I'_1, \dots, I'_k)$ ← GenerateInkblotImages$(1^k, r'_1)$
    $\hat{c}_\pi$ ← $((I'_1, \dots, I'_k), (\ell_{\pi(1)}, \dots, \ell_{\pi(k)}))$
(User): Solves $\hat{c}_\pi$ and sends the answer $\pi' = H(\hat{c}_\pi, \sigma_t)$.
(Server):
    for all $\pi_0$ s.t. $d_k(\pi_0, \pi') \le \alpha$ do
        $h_{pw,0}$ ← $H(u, s, pw', \pi_0(1), \dots, \pi_0(k))$
        if $h_{pw,0} = h_{pw}$ then
            Authenticate
    Deny

We observe that an adversary mounting an online attack would be naturally
rate limited because he would need to solve a GOTCHA for each new guess.
Protocol 6.3.2 could also be supplemented with a $k$-strikes policy, in which a
user is locked out for several hours after $k$ incorrect login attempts, if desired.
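The server-side bookkeeping of protocols 6.3.1 and 6.3.2 can be sketched as below. This is a simplified illustration under several assumptions: SHA-256 stands in for the slow password hash (a deployment would use something like BCRYPT), $d_k$ is taken to be the number of positions in which two permutations differ, the Extract step and Inkblot generation are omitted, and close permutations are found by filtering all $k!$ permutations, which is only sensible for small $k$. The names `create_account`, `authenticate` and `distance` are hypothetical.

```python
import hashlib
import itertools
import os


def H(*parts):
    """Stand-in for the password hash H (assumption: SHA-256, not a slow hash)."""
    data = "|".join(str(p) for p in parts).encode()
    return hashlib.sha256(data).hexdigest()


def distance(p, q):
    """Number of positions where two permutations differ (assumed meaning of d_k)."""
    return sum(a != b for a, b in zip(p, q))


def create_account(u, pw, labels, pi):
    """labels[i] is the user's label for inkblot i; pi is a permutation of range(k)."""
    s = os.urandom(16).hex()                   # salt
    h_pw = H(u, s, pw, *pi)
    permuted_labels = [labels[i] for i in pi]  # labels are stored in permuted order
    return {"u": u, "s": s, "h_pw": h_pw, "labels": permuted_labels}


def authenticate(record, pw_guess, pi_guess, alpha):
    """Accept if some permutation within distance alpha of pi_guess hashes correctly."""
    k = len(pi_guess)
    for pi0 in itertools.permutations(range(k)):
        if distance(pi0, pi_guess) <= alpha:
            if H(record["u"], record["s"], pw_guess, *pi0) == record["h_pw"]:
                return True
    return False
```

With `alpha = 0` the user must recover the permutation exactly; larger values of `alpha` tolerate a few mismatched labels at the price of more hash evaluations per login attempt.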


⁸ A more precise calculation reveals that there are exactly 13,264 permutations s.t. $d_{10}(\pi', \pi) \le 5$
and a random permutation $\pi'$ would only be accepted with probability $3.66 \times 10^{-3}$.
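Both the bound of Claim 7 and the exact count can be checked with a few lines of Python. The sketch assumes that $d_k(\pi, \pi')$ counts the positions in which the two permutations disagree; under that reading, the permutations within distance $\alpha$ of a fixed $\pi$ are obtained by choosing $i \le \alpha$ positions and deranging them. The function names are illustrative only.

```python
from math import comb, factorial


def claim7_bound(k, alpha):
    """Upper bound from Claim 7: 1 + sum_{i=2}^{alpha} C(k, i) * i!."""
    return 1 + sum(comb(k, i) * factorial(i) for i in range(2, alpha + 1))


def derangements(n):
    """Number of permutations of n items with no fixed point."""
    d = [1, 0]
    for i in range(2, n + 1):
        d.append((i - 1) * (d[i - 1] + d[i - 2]))
    return d[n]


def exact_count(k, alpha):
    """Exact number of permutations differing from a fixed one in at most alpha positions."""
    return sum(comb(k, i) * derangements(i) for i in range(alpha + 1))


def acceptance_probability(k, alpha):
    """Chance that a uniformly random permutation falls within distance alpha."""
    return exact_count(k, alpha) / factorial(k)
```

For $k = 10$ and $\alpha = 5$, `claim7_bound` gives 36,091 and `exact_count` gives 13,264, in agreement with the figures quoted in the text.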
6.3.2 User Study


To test our candidate GOTCHA construction we conducted an online user study.⁹
We  recruited  participants  through  Amazon's  Mechanical  Turk  to  participate  in 
our  study.  The  study  was  conducted  in  two  phases.  In  phase  1  we  generated  ten 
random  Inkblot  images  for  each  participant,  and  asked  each  participant  to  provide 
labels  for  their  Inkblot  images.  Participants  were  advised  to  use  creative  titles  (e.g., 
evil  clown,  frog,  lady  with  poofy  dress)  because  they  would  not  need  to  remember 
the  exact  titles  that  they  used.  Participants  were  paid  $1  for  completing  this  first 
phase.  A  total  of  70  users  completed  phase  1. 

After  our  participants  completed  the  first  phase  we  waited  ten  days  before 
asking  our  participants  to  return  and  complete  phase  2.  During  phase  2  we  showed 
each  participant  the  Inkblot  images  they  saw  in  phase  1  (in  a  random  order)  as  well 
as  the  titles  that  they  created  during  phase  1  (in  alphabetical  order).  Participants 
were  asked  to  match  the  labels  with  the  appropriate  image.  The  purpose  of  the 
longer  waiting  time  was  to  make  sure  that  participants  had  time  to  forget  their 
images  and  their  labels.  See  figure  6.3  for  an  example  of  phase  2.  Participants  were 
paid  an  additional  $1  for  completing  phase  2  of  the  user  study.  At  the  beginning 
of  the  user  study  we  let  participants  know  that  they  would  be  paid  during  phase 
2  even  if  their  answers  were  not  correct.  We  adopted  this  policy  to  discourage 
cheating  (e.g.,  using  screen  captures  from  phase  1  to  match  the  images  and  the 
labels)  and  avoid  positively  biasing  our  results. 

We  measured  the  time  it  took  each  participant  to  complete  phase  1.  Our  results 
are  summarized  in  Table  6.1.  It  is  quite  likely  that  some  participants  left  their 
computer  in  the  middle  of  the  study  and  returned  later  to  complete  the  study 
(e.g.,  one  user  took  57.5  minutes  to  complete  the  study).  While  we  could  not 
measure  time  away  from  the  computer,  we  believe  that  it  is  likely  that  at  least 
9  of  our  participants  left  the  computer.  Restricting  our  attention  to  the  other  61 
participants  who  took  at  most  20  minutes  we  get  an  adjusted  average  completion 
time  of  6.2  minutes. 

Fifty-eight  of  our  participants  returned  to  complete  phase  2  by  taking  our 
matching  test.  It  took  these  participants  4.5  minutes  on  average  to  complete  the 
matching  test.  Seventeen  of  our  participants  correctly  matched  all  ten  of  their 
labels,  and  69%  of  participants  matched  at  least  5  out  of  ten  labels  correctly.  Our 
results  are  summarized  in  Table  6.2. 

⁹ Our study protocol was approved for exemption by the Institutional Review Board (IRB) at
Carnegie Mellon University (IRB Protocol Number: HS13-219).
                 Phase 1    Phase 2
Average          9.3        4.5
StdDev           9.6        3
Max              57.5       18.5
Min              1.4        1.6
Average ≤ 20     6.2        N/A

Table 6.1: Completion Times (in minutes)


α-accurate    # participants    # participants / 58    |{π' | d_10(π, π') ≤ α}| / 10!
α = 0         17                0.29                   2.76 × 10^-7
α = 2         22                0.38                   1.27 × 10^-5
α = 3         26                0.45                   7.88 × 10^-5
α = 4         34                0.59                   6.00 × 10^-4
α = 5         40                0.69                   3.66 × 10^-3

Table 6.2: Usability Results: Fraction of Participants who would have authenticated
with accuracy parameter α


Discussion  Our  user  study  provides  evidence  that  our  construction  is  at  least 
(0, 0.29)-usable  or  (5, 0.69)-usable.  While  this  means  that  our  Inkblot  Matching 
GOTCHA  could  be  used  by  a  significant  fraction  of  the  population  to  protect 
their  passwords  during  authentication  it  also  means  that  the  use  of  our  GOTCHA 
would  have  to  be  voluntary  so  that  users  who  have  difficulty  won't  get  locked  out 
of  their  accounts.  Another  approach  would  be  to  construct  different  GOTCHAs 
and  allow  users  to  choose  which  GOTCHA  to  use  during  authentication. 

Study Incentives: There is evidence that the lack of monetary incentives to
perform well on our matching test may have negatively influenced the results (e.g.,
some participants may have rushed through phase 1 of the study because their
payment in phase 2 was independent of their ability to match their labels correctly).
For example, none of our 18 fastest participants during phase 1 matched all of
their labels correctly, and, excluding participants we believe left their computer
during phase 1 (e.g., took longer than 20 minutes), on average participants who
failed to match at least five labels correctly took 2 minutes less time to complete
phase 1 than participants who did.

Time: We imagine that some web services may be reluctant to adopt GOTCHAs
out of fear of driving away customers who don't want to spend time labeling Inkblot
images [76]. However, we believe that for many high security applications (e.g.,
online banking) the extra security benefits of GOTCHAs will outweigh the costs.
GOTCHAs might even help a bank keep its customers by providing extra assurance
that users' passwords are secure. We are looking at modifying our Inkblot
generation algorithm to produce Inkblots which require less "mental effort" to
label. In particular, could techniques like Perlin Noise [119] be used to generate
Inkblots that can be labeled more quickly and matched more accurately?

Accuracy:  We  believe  that  the  usability  of  our  Inkblot  Matching  GOTCHA 
construction  can  still  be  improved.  One  simple  way  to  improve  the  usability  of 
our  GOTCHA  construction  would  be  to  allow  the  user  to  reject  Inkblot  images  that 
were  confusing.  We  also  believe  that  usability  could  be  improved  by  providing 
users  with  specific  strategies  for  creating  their  labels  (e.g.,  we  found  that  simple 
labels  like  "a  voodoo  mask"  were  often  mismatched,  while  more  elaborate  stories 
like  "A  happy  guy  on  the  ground,  protecting  himself  from  ticklers"  were  rarely 
mismatched). 


6.3.3  An  Open  Challenge  to  the  AI  Community 

We  envision  a  rich  interaction  between  the  security  community  and  the  artificial 
intelligence  community.  To  facilitate  this  interaction  we  present  an  open  challenge 
to  break  our  GOTCHA  scheme. 


Challenge Setup We chose several random passwords $(pw_1, \dots, pw_4) \sim [0, 10^7]$
and $pw_5 \sim [0, 10^8]$. We used a function GenerateInkblots$(pw_i, 10)$ to generate ten
inkblots $I^i_1, \dots, I^i_{10}$ for each password, and we had a human label each inkblot image

$$(\ell^i_1, \dots, \ell^i_{10}) \leftarrow H\left((I^i_1, \dots, I^i_{10}), \sigma_0\right) .$$

We selected a random permutation $\pi_i : [10] \to [10]$ for each account, and generated the tuple

$$T_i = \left(s_i, h\left(pw_i, s_i, \pi_i(1), \dots, \pi_i(10)\right), \ell^i_{\pi_i(1)}, \dots, \ell^i_{\pi_i(10)}\right) ,$$

where $s_i$ is a randomly selected salt value and $h$ is a cryptographic hash function.
We are releasing the source code that we used to generate the Inkblots and evaluate
the hash function $H$ along with the tuples $T_1, \dots, T_5$; see
http://www.cs.cmu.edu/~jblocki/GOTCHA-Challenge.html.

Challenge: Recover each password $pw_i$.


Approaches One way to accomplish this goal would be to enumerate over every
possible password guess $pw'$ and evaluate $H(pw', s_i, \pi(1), \dots, \pi(10))$ for every
possible permutation $\pi : [10] \to [10]$. However, the goal of this challenge is to
see if AI techniques can be applied to attack our GOTCHA construction. We
intentionally selected our passwords from a smaller space to make the challenge
more tractable for AI based attacks, but to discourage participants from trying
to brute force over all password/permutation pairs we used BCRYPT (Level 15),¹⁰
an expensive hash function, to hash the passwords. Our implementation
allows the Inkblot images to be generated very quickly from a password guess
$pw'$, so an AI program that can use the labels in the password file to distinguish
between the correct Inkblots returned by GenerateInkblots$(pw_i, 10)$ and incorrect
Inkblots returned by GenerateInkblots$(pw', 10)$ would be able to quickly dismiss
incorrect guesses. Similarly, an AI program which generates a small set of likely
permutations for each password guess could allow an attacker to quickly dismiss
incorrect guesses.
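To give a rough sense of why naive brute force is discouraged, the size of the password/permutation search space can be computed directly. The sketch below only counts hash evaluations; the per-hash time passed to `estimated_years` is a hypothetical input, not a measurement of BCRYPT at level 15.

```python
from math import factorial

SECONDS_PER_YEAR = 365 * 24 * 3600


def brute_force_hashes(num_passwords, k):
    """Hash evaluations needed to try every password with every permutation of k labels."""
    return num_passwords * factorial(k)


def estimated_years(num_hashes, seconds_per_hash):
    """Sequential running time in years for a hypothetical per-hash cost."""
    return num_hashes * seconds_per_hash / SECONDS_PER_YEAR


# Passwords drawn from a space of about 10^7 values, ten inkblots per account.
work = brute_force_hashes(10**7, 10)
```

Here `work` is about $3.6 \times 10^{13}$ hash evaluations for a single account, which is why an attack that can discard most password guesses or most permutations without hashing would be far more effective.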


6.4  Analysis:  Cost  of  Offline  Attacks 

In this section we argue that our password scheme (protocols 6.3.1 and 6.3.2)
significantly mitigates the threat of offline attacks. An informal interpretation
of our main technical result, Theorem 21, is that either (1) the adversary's
offline attack is prohibitively expensive, (2) there is a good chance that the adversary's
offline attack will fail, or (3) the underlying GOTCHA construction can be broken.
Observe that the security guarantees are still meaningful even if the security
parameters $\epsilon$ and $\delta$ are not negligibly small.

Theorem 21. Suppose that our user selects his password uniformly at random from a set
$D$ (e.g., $pw \leftarrow D$) and creates his account using protocol 6.3.1. If algorithms 6.2 and 6.3
are an $(\epsilon, \delta, \mu)$-GOTCHA then no conservative offline adversary is $(C, \gamma + \epsilon + \delta + p, D)$-
successful for $C < \gamma |D| 2^{\mu(k)} c_H + n_H c_H$.

¹⁰ The level parameter specifies the computation complexity of hashing. The amount of work
necessary to evaluate the BCRYPT hash function increases exponentially with the level so in our
case the work increases by a factor of $2^{15}$.

Proof of Theorem 21. (Sketch) We use a hybrid argument. An adversary who
breaches the server is able to recover the tuple
$t = (u, r', s, H(u, s, pw, \pi(1), \dots, \pi(k)), \ell_{\pi(1)}, \dots, \ell_{\pi(k)})$
as well as the code for the cryptographic hash function $H$ and the code for our
GOTCHA $(G_1, G_2)$.

1. World 0: $W_0$ denotes the real world in which the adversary has recovered
the tuple

$$t_0 = \left(u, r', s, H(u, s, pw, \pi(1), \dots, \pi(k)), \ell_{\pi(1)}, \dots, \ell_{\pi(k)}\right)$$

as well as the code for the cryptographic hash function $H$ and the code for
our GOTCHA $(G_1, G_2)$. Because the adversary Adv is conservative it
constructs the function

$$\mathrm{VerifyHash}(pw', \pi') = \begin{cases} 1 & \text{if } pw' = pw \text{ and } \pi' = \pi \\ 0 & \text{otherwise} \end{cases}$$

and uses VerifyHash as a blackbox. We say that Adv queries a human $H$
about password $pw'$ if it queries $H$ for $H(\mathrm{GenerateInkblotImages}(1^k, \mathrm{Extract}(pw', r')), \ldots)$
and we let $D' \subseteq D$ denote the set of passwords for which the adversary
queries a human.


2. World 1: $W_1$ denotes a hypothetical world that is similar to $W_0$ except that
the VerifyHash function the adversary uses as a blackbox is replaced with the
following incorrect version

$$\mathrm{VerifyHash}_1(pw', \pi') = \begin{cases} 1 & \text{if } pw' \notin D',\ pw' = pw \text{ and } \pi' = \pi \\ 0 & \text{otherwise} \end{cases}$$

where $D' \subseteq D$ denotes the set of passwords
for which the adversary makes queries to a human in the real world.

3. World 2: $W_2$ denotes a hypothetical world that is similar to $W_1$ except that
the VerifyHash$_1$ function the adversary uses as a blackbox is replaced with the
following incorrect version

$$\mathrm{VerifyHash}_2(pw', \pi') = \begin{cases} 1 & \text{if } \pi' = R\left(G_2\left(1^k, \mathrm{Extract}(pw', r'), \vec{\ell}\right)\right),\ pw' \notin D' \text{ and } pw' = pw \\ 0 & \text{otherwise} \end{cases}$$

where $R$ is a distribution with minimum entropy $\mu(k)$ as in definition 15.

4. World 3: $W_3$ denotes a hypothetical world which is similar to world 2,
except that the labels $\ell_{\pi(1)}, \dots, \ell_{\pi(k)}$ are replaced with the labels $\ell'_{\pi'(1)}, \dots, \ell'_{\pi'(k)}$,
where $\pi' : [k] \to [k]$ is a new random permutation, and the labels $\ell'_i$ are for a
completely unrelated set of Inkblot challenges

$$\ell'_1, \dots, \ell'_k \leftarrow H\left(G_1(1^k, x_1, x_2)\right) ,$$

where $x_1, x_2 \in \{0,1\}^n$ are freshly chosen random values.


In world 3 it is easy to bound the adversary's probability of success. No adversary
is $(C, \gamma, D)$-successful for $C < \gamma |D| 2^{\mu(k)} c_H$, because the fake Inkblot labels are not
correlated with the actual Inkblots that were generated with the real password. Our
particular adversary cannot be $(C, \gamma, D)$-successful for $C < \gamma |D| 2^{\mu(k)} c_H + |D'| c_h$. In
world 2 the adversary might improve his chances of success by looking at the
Inkblot labels, but by definition of $(\alpha, \beta, \epsilon, \delta, \mu)$-GOTCHA his chances change by at
most $\delta$. In world 1 the adversary might further improve his chances of success, but
by definition of $(\alpha, \beta, \epsilon, \delta, \mu)$-GOTCHA his chances improve by at most $\epsilon$. Finally,
in world 0 the adversary improves his chances by at most $|D'|/|D|$ by querying the
human about passwords in $D'$. □
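The hybrid steps can be collected into a single chain of inequalities. The notation $p_j$ for the adversary's success probability in world $W_j$ is introduced here only for illustration and does not appear in the proof sketch itself.

```latex
% p_j: success probability of the adversary in world W_j (notation assumed here)
\begin{align*}
p_0 &\le p_1 + \frac{|D'|}{|D|}
      && \text{(human queries on passwords in } D'\text{)} \\
    &\le p_2 + \epsilon + \frac{|D'|}{|D|}
      && \text{(world 1 versus world 2)} \\
    &\le p_3 + \delta + \epsilon + \frac{|D'|}{|D|}
      && \text{(world 2 versus world 3)}
\end{align*}
```

Since $p_3 < \gamma$ whenever $C < \gamma |D| 2^{\mu(k)} c_H + |D'| c_h$, the real-world success probability is below $\gamma + \epsilon + \delta + |D'|/|D|$ for such budgets.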


6.5  Discussion 

We  conclude  by  discussing  some  key  directions  for  future  work. 


Improved Inkblots One way to improve our GOTCHA construction would be
to improve the Inkblot generation algorithm. One idea is to use random walks
to generate Inkblots [129] instead of adding colored ellipses with random sizes,
locations and orientations (e.g., Figure 6.1). Figure 6.4a is an example of an Inkblot
produced with random walks. The hope is that users will find it easier to label
these Inkblot images. Another potential improvement would be to have the user
identify and highlight several specific objects in his Inkblot image(s) during account
creation (see Figure 6.4b). When the user authenticates he would be asked
to click on each of these objects (e.g., click on the "Bunny Ears"). One advantage is
that we may be able to provide equivalent security guarantees by having the user
specify two or three specific objects in the Inkblot images instead of requiring the
user to label and match ten different Inkblot images. While the author of this thesis
has personally found these random walk Inkblots much easier to label than the
Inkblots described earlier, we have not yet conducted a user study to empirically
evaluate these potential improvements.


Other  GOTCHA  Constructions  Because  GOTCHAs  allow  for  human  feedback 
during  puzzle  generation  —  unlike  HOSPs  [51]  —  our  definition  potentially  opens 
up  a  much  wider  space  of  potential  GOTCHA  constructions.  One  idea  might  be 
to  have  a  user  rate/rank  random  items  (e.g.,  movies,  activities,  foods).  By  allowing 
human  feedback  we  could  allow  the  user  to  dismiss  potentially  confusing  items 
(e.g.,  movies  he  hasn't  seen,  foods  about  which  he  has  no  strong  opinion).  There 
is some evidence that this approach could provide security (e.g., Narayanan and
Shmatikov showed that a Netflix user can often be uniquely identified from a few
movie ratings [114]).


Obfuscating CAPTCHAs If it were possible to efficiently obfuscate programs
then it would be easy to construct GOTCHAs from CAPTCHAs (e.g., just obfuscate
a program that returns the CAPTCHA without the answer). Recently, Garg
et al. showed how to obfuscate arbitrary programs [80] using multilinear maps.¹¹
Unfortunately, their obfuscator is not yet efficient enough for practical use. However,
it may still be possible to find an efficient way to obfuscate our particular
CAPTCHA program.

¹¹ While Barak et al. [24] showed that there is no general program obfuscator, their impossibility
result  was  for  a  stronger  notion  of  obfuscation  called  blackbox  obfuscation,  which  requires  that  any 
adversary  with  access  to  an  obfuscated  program  can  be  simulated  with  only  blackbox  access  to  the 
same  program.  Garg  et  al.  [80]  used  a  weaker  notion  of  obfuscation  known  as  "indistinguishability 
obfuscation,"  which  (loosely)  only  guarantees  that  the  adversary  cannot  distinguish  between  the 
obfuscations  of  two  circuits  which  compute  the  same  function. 
Exploiting The Power of Interaction Can interaction be exploited and used to
improve security or usability in human-authentication? While interaction is an
incredibly powerful tool in computer security (e.g., nonces [128], zero-knowledge
proofs [85], secure multiparty computation [163]) and in complexity theory,¹²
human  authentication  typically  does  not  exploit  interaction  with  the  human  (e.g., 
the  user  simply  enters  his  password).  We  view  the  idea  behind  HOSPs  and 
GOTCHAs  —  exploiting  interaction  to  mitigate  the  threat  of  offline  attacks  —  as 
a  positive  step  in  this  direction.  Could  interaction  be  exploited  to  reduce  memory 
burden  on  the  user  by  allowing  a  user  to  reuse  the  same  secret  to  authenticate 
to  multiple  different  servers?  The  human-authentication  protocol  of  Hopper  et 
al.  [91]  —  based  on  the  noisy  parity  problem  —  could  be  used  by  a  human  to 
repeatedly  authenticate  over  an  insecure  channel.  Unfortunately,  the  protocol  is 
slow  and  tedious  for  a  human  to  execute,  and  it  can  be  broken  if  the  adversary  is 
able  to  ask  adaptive  parity  queries  [103]. 


¹² A polynomial time verifier can verify PSPACE-complete languages by interacting with a powerful
prover [138]; by contrast, the same verifier can only check proofs of NP-Complete languages
without interaction.
Figure 6.2: Phase 1 (screenshot: the participant is prompted to "Enter a title for the Image Below"; example title: "Aunt Martha wants to squeeze your cheeks")

Figure 6.3: Phase 2 (screenshot: the participant is asked "Which title did you use for the Image Above?" and selects from the titles created in phase 1, e.g., "fruit with flies around it", "guy on a hang glider", "wreck-it ralph")

Figure 6.4: Random Walk Inkblots. (a) Example Random Walk Inkblot. (b) Example Inkblot with Labels (title: "Two Bicyclists Pedal Away from a Tall bunny in a Suit"; highlighted objects: "Bunny Ears", "Bicyclists Eyes", "Bicyclists Butt")
Chapter 7


Appendix: Naturally Rehearsing Passwords

7.1 Missing Proofs


Before  we  prove  Lemma  1  and  Theorem  1  we  first  formally  define  a  Poisson  arrival 
process  (Definition  18)  and  state  a  few  basic  facts  about  a  Poisson  arrival  process. 

Definition 18. Given $0 \le t_1 < t_2$ we use $\mathrm{Visits}_i(t_1, t_2) = \left|\{j \mid t^i_j \in (t_1, t_2)\}\right|$ to denote
the number of times the user visits account $A_i$ during the interval $(t_1, t_2)$. We say that
$\mathrm{Visits}_i(t_1, t_2)$ represents a Poisson arrival process with parameter $\lambda_i$ if

$$\Pr\left[\mathrm{Visits}_i(t_1, t_2) = k\right] = e^{\lambda_i (t_1 - t_2)} \frac{\left(\lambda_i (t_2 - t_1)\right)^k}{k!}$$

and the random variables $\mathrm{Visits}_i(t_1, t_2)$ and $\mathrm{Visits}_i(t_3, t_4)$ are independent whenever
$0 \le t_1 < t_2 \le t_3 < t_4$.

Fact 4 says that $1/\lambda_i$ represents the average inter-visitation time for account $A_i$
whose visitation schedule follows a Poisson arrival process with parameter $\lambda_i$.

Fact 4. If $\mathrm{Visits}_i(t_1, t_2)$ represents a Poisson arrival process with parameter $\lambda_i$ then $1/\lambda_i$
represents the average inter-visitation time for account $A_i$. More formally, for all $j > 0$ we
have $E\left[t^i_j - t^i_{j-1}\right] = \frac{1}{\lambda_i}$.

Fact 5 says that the sum of two Poisson arrival processes with parameters $\lambda_1$
and $\lambda_2$ is itself a Poisson arrival process with parameter $(\lambda_1 + \lambda_2)$.

Fact 5. If $\mathrm{Visits}_i(t_1, t_2)$ represents a Poisson arrival process with parameter $\lambda_i$ and
$\mathrm{Visits}_j(t_1, t_2)$ represents an independent Poisson arrival process with parameter $\lambda_j$ then
$\mathrm{Visits}_{i,j}(t_1, t_2) = \mathrm{Visits}_i(t_1, t_2) + \mathrm{Visits}_j(t_1, t_2)$ is a Poisson arrival process with parameter
$(\lambda_i + \lambda_j)$.

Reminder of Lemma 1. Let $S_c = \{i \mid c \in c_i\}$ and let $\lambda_c = \sum_{i \in S_c} \lambda_i$ then the probability
that the cue $c$ is not naturally rehearsed during time interval $[a, b]$ is $\exp(-\lambda_c (b - a))$.

Proof of Lemma 1. Let $N(t_1, t_2) = \left|\{t^i_k \mid i \in S_c \wedge t_1 \le t^i_k \le t_2\}\right|$ denote the number of
times the cue $c$ is rehearsed during the interval $[t_1, t_2]$. Notice that the rehearsal
requirement $[a, b]$ is naturally satisfied if and only if $N(a, b) > 0$. By Fact 5, $N(t_1, t_2) =
\sum_{i \in S_c} \mathrm{Visits}_i(t_1, t_2)$ describes a Poisson arrival process with parameter $\lambda_c = \sum_{i \in S_c} \lambda_i$
so we can apply the definition of a Poisson arrival process to get

$$\Pr\left[N(a, b) = 0\right] = \exp\left(-\lambda_c (b - a)\right) .$$

□
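Lemma 1 is easy to evaluate numerically. The sketch below assumes the rates $\lambda_i$ of the accounts sharing cue $c$ are given as a list; the function names are illustrative. It also includes the Poisson probability from Definition 18, so the lemma can be cross-checked as the $k = 0$ case with the combined rate.

```python
import math


def poisson_pmf(k, lam, t1, t2):
    """Pr[Visits(t1, t2) = k] for a Poisson arrival process with rate lam."""
    mean = lam * (t2 - t1)
    return math.exp(-mean) * mean**k / math.factorial(k)


def prob_not_rehearsed(rates, a, b):
    """Lemma 1: probability that a cue shared by accounts with the given
    visitation rates is not naturally rehearsed during [a, b]."""
    lam_c = sum(rates)  # Fact 5: independent Poisson processes add their rates
    return math.exp(-lam_c * (b - a))
```

For example, a cue shared by two accounts with rates 0.5 and 0.25 visits per day goes unrehearsed over a four day interval with probability $e^{-3}$, roughly 5%.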


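Reminder of Theorem 1. Let $i_c^* = \left(\arg\max_x t^c_x < t\right) - 1$ then

$$E\left[XR_t\right] = \sum_{c \in C} \sum_{i=0}^{i_c^*} \exp\left(-\left(\sum_{j : c \in c_j} \lambda_j\right)\left(t^c_{i+1} - t^c_i\right)\right) .$$

Proof of Theorem 1. Let $S_c = \{i \mid c \in c_i\}$ and let $V_{a,b}(c)$ be the indicator for the event
that $\exists i \in S_c, k \in \mathbb{N}.\ t^i_k \in [a, b]$ (e.g., cue $c$ is rehearsed naturally during the time
interval $[a, b]$). Then by linearity of expectation

$$E\left[XR_{t,c}\right] = \sum_{i=0}^{i_c^*} \left(1 - E\left[V_{t^c_i, t^c_{i+1}}(c)\right]\right) ,$$

where

$$\sum_{i=0}^{i_c^*} E\left[1 - V_{t^c_i, t^c_{i+1}}(c)\right] = \sum_{i=0}^{i_c^*} \exp\left(-\left(\sum_{j : c \in c_j} \lambda_j\right)\left(t^c_{i+1} - t^c_i\right)\right)$$

by Lemma 1. The result follows immediately from linearity of expectation.

□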
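The sum in Theorem 1 can be evaluated directly once the rehearsal schedule and visitation rates are known. The following sketch assumes each cue's rehearsal times $t^c_0 < t^c_1 < \dots$ are given as a sorted list and each account is described by its set of cues and its rate $\lambda_j$; the data layout and function name are hypothetical.

```python
import math


def expected_extra_rehearsals(schedules, cue_sets, rates, t):
    """Evaluate the double sum of Theorem 1.

    schedules: dict mapping cue c -> sorted list of rehearsal times t^c_0, t^c_1, ...
    cue_sets:  list where cue_sets[j] is the set of cues used by account j
    rates:     list where rates[j] is the visitation rate lambda_j of account j
    t:         time horizon; only intervals [t^c_i, t^c_{i+1}] with t^c_{i+1} < t count
    """
    total = 0.0
    for c, times in schedules.items():
        # Combined rate of all accounts whose cue set contains c.
        lam_c = sum(rates[j] for j, cues in enumerate(cue_sets) if c in cues)
        for i in range(len(times) - 1):
            if times[i + 1] < t:
                # Lemma 1: probability the interval is not rehearsed naturally.
                total += math.exp(-lam_c * (times[i + 1] - times[i]))
    return total
```

Each term is the probability that one rehearsal interval is missed, so the total is the expected number of rehearsals the user must perform outside of normal account visits.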


Reminder  of  Theorem  4.  It  is  NP-Hard  to  approximate  Min-Rehearsal  within  a 
constant  factor. 

Proof of Theorem 4. Let $\gamma > 0$ be any constant. We prove that it is NP-Hard to even
$\gamma$-approximate Min-Rehearsal. The reduction is from set cover.


Set Cover Instance: Sets $S_1, \dots, S_n$ and universe $U = \bigcup_i S_i$. A set cover is a set
$S \subseteq \{1, \dots, n\}$ such that $\bigcup_{i \in S} S_i = U$.

Question: Is there a set cover of size $k$?


Given a set cover instance, we set $C = U$ and create public cues $c_1, \dots, c_m \subseteq C$ for each account by setting $c_i = S_i$. We set the following visitation schedule

$$\lambda_i = \frac{\ln\left(\gamma |U| \left(\max_{c \in C} i_c^*\right)\right)}{\min_{j,c}\left(t^c_{j+1} - t^c_j\right)}$$

for $i = 1, \dots, k$ and $\lambda_{k+1}, \dots, \lambda_n = 0$. There are two cases: (1) There is a set cover $S = \{x_1, \dots, x_k\} \subseteq \{1, \dots, n\}$ of size $k$. If we assign $\pi(i) = x_i$ for each $i \le k$ then for each base cue $c \in U$ we have

$$\sum_{x : c \in S_{\pi(x)}} \lambda_x \ge \lambda_1.$$

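Applying Theorem 1 we get

$$\begin{aligned}
\mathbb{E}[XR_t] &= \sum_{c \in C} \sum_{i=0}^{i_c^*} \exp\left(-\left(t^c_{i+1} - t^c_i\right) \sum_{x : c \in S_{\pi(x)}} \lambda_x\right) \\
&\le |C| \left(\max_{c \in C} i_c^*\right) \exp\left(-\lambda_1 \min_{j,c}\left(t^c_{j+1} - t^c_j\right)\right) \\
&\le |U| \left(\max_{c \in C} i_c^*\right) \exp\left(-\ln\left(\gamma |U| \left(\max_{c \in C} i_c^*\right)\right)\right) \\
&\le \frac{|U| \max_{c \in C} i_c^*}{\gamma |U| \left(\max_{c \in C} i_c^*\right)} = \frac{1}{\gamma}.
\end{aligned}$$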

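(2) There is no set cover of size $k$. Given a mapping $\pi$ we let $S_\pi = \{i \mid \exists j \le k.\ \pi(j) = i\}$ be the set of all public cues visited with frequency at least $\lambda_1$. Because $|S_\pi| = k$, $S_\pi$ cannot be a set cover and there exists some $c_j \in C$ which is never visited, so none of its rehearsal requirements are satisfied naturally:

$$\mathbb{E}[XR_t] = \sum_{c \in C} \sum_{i=0}^{i_c^*} \exp\left(-\left(t^c_{i+1} - t^c_i\right) \sum_{x : c \in S_{\pi(x)}} \lambda_x\right) \ge \sum_{i=0}^{i^*_{c_j}} 1 \ge 1.$$

□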

Reminder of Theorem 2. Let $\{c_1, \dots, c_m\}$ be a $(n, \ell, \gamma)$-sharing set of $m$ public cues produced by the password management scheme $\mathcal{G}_m$. If each $a_i \in \mathcal{AS}$ is chosen uniformly at random then $\mathcal{G}_m$ satisfies $(q, \delta, m, s, r, h)$-security for $\delta \le \frac{q}{|\mathcal{AS}|^{\ell - \gamma r}}$ and any $h$.

Proof of Theorem 2. Recall that $S$ (resp. $S'$) denotes the set of accounts that the adversary selected for plaintext recovery attacks. Let $(k, p'_k)$ denote the adversary's final answer. We can assume that $k \notin S$ because the adversary cannot win by outputting a password he obtained earlier in the game during a plaintext recovery attack. We define

$$U_k = c_k - \{c \mid \exists j \in S.\ c \in c_j\},$$

to be the set of all uncompromised base cues in $c_k$. Observe that

$$|U_k| \ge |c_k| - \sum_{j \in S} |c_k \cap c_j| \ge \ell - \sum_{j \in S} \gamma \ge \ell - r\gamma,$$

by definition 5 of a $(n, \ell, \gamma)$-sharing family of public cues.

For each $c \in U_k$ the corresponding association $\hat{a}$ was chosen uniformly at random from $\mathcal{AS}$. We can upper bound $B_k$, the bad event that the adversary guesses $(k, p_k)$ in at most $q$ attempts:

$$\Pr[B_k] \le \frac{q}{|\mathcal{AS}|^{|U_k|}}.$$

□
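To get a feel for the bound, the following sketch evaluates $q / |\mathcal{AS}|^{\ell - \gamma r}$ for a given parameter choice. The function name and the numbers in the example are hypothetical and not taken from the thesis; the bound is capped at 1 because it is a probability.

```python
def guessing_bound(q, association_space_size, ell, gamma, r):
    """Upper bound on the adversary's success probability from Theorem 2.

    q: number of guesses, ell: cues per account, gamma: maximum overlap
    between two accounts' cue sets, r: number of recovered passwords.
    """
    uncompromised = ell - gamma * r      # lower bound on |U_k|
    if uncompromised <= 0:
        return 1.0                       # every cue may be compromised
    return min(1.0, q / association_space_size ** uncompromised)


# Illustrative values only: 1000 guesses, |AS| = 100, ell = 4, gamma = 1, r = 1.
example = guessing_bound(1000, 100, 4, 1, 1)
print(example)
```

Each additional recovered password removes up to $\gamma$ cues from the exponent, which is why a small $\gamma$ matters for security.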


Reminder of Theorem 3. Suppose that $\mathcal{S} = \{S_1, \dots, S_m\}$ is a $(n, \ell, \gamma)$-sharing set family of size $m$; then $m \le \binom{n}{\gamma+1} / \binom{\ell}{\gamma+1}$.

Proof of Theorem 3. Let $S \in \mathcal{S}$ be given, and let $T \subseteq S$ be a subset of size $|T| = \gamma + 1$. By definition of $(n, \ell, \gamma)$-sharing we cannot have $T \subseteq S'$ for any other set $S' \in \mathcal{S} - S$. In total there are $\binom{n}{\gamma+1}$ subsets of $[n]$ of size $\gamma + 1$ and each $S \in \mathcal{S}$ contains $\binom{\ell}{\gamma+1}$ of them. The result follows from the pigeonhole principle. □
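The counting bound is easy to evaluate numerically. The sketch below assumes the bound has the binomial form $\binom{n}{\gamma+1} / \binom{\ell}{\gamma+1}$ used in the proof; the function name and the parameter choices in the example are illustrative.

```python
from math import comb


def max_family_size(n, ell, gamma):
    """Largest m allowed by the pigeonhole bound for an (n, ell, gamma)-sharing family.

    Every set of size ell contains comb(ell, gamma + 1) subsets of size gamma + 1,
    and no such subset may appear in two different sets of the family.
    """
    return comb(n, gamma + 1) // comb(ell, gamma + 1)


# Illustrative parameter choices.
print(max_family_size(9, 4, 3))    # comb(9, 4) // comb(4, 4)
print(max_family_size(43, 4, 1))   # comb(43, 2) // comb(4, 2)
```

The integer division reflects that the family size $m$ must be a whole number.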


7.2  Varying  the  Association  Strength  Constant 

In Tables 2.2 and 2.3 we used the same association strength constant $\sigma = 1$ for each scheme, though we expect that $\sigma$ will be higher for schemes like Shared Cues that use strong mnemonic techniques. We explore the effect of $\sigma$ on $\mathbb{E}[XR_{t,c}]$ under various values of the natural rehearsal rate $\lambda$. Table 7.1 shows the values $\mathbb{E}[XR_{t,c}]$ under the expanding rehearsal assumption for $\sigma \in \{0.1, 0.5, 1, 2\}$. We consider the following natural rehearsal rates: $\lambda = 1$ (e.g., naturally rehearsed daily), $\lambda = 1/3$, $\lambda = 1/7$ (e.g., naturally rehearsed weekly), $\lambda = 1/31$ (e.g., naturally rehearsed monthly).

Table 7.2 shows the values $\mathbb{E}[XR_{t,c}]$ under the constant rehearsal assumption for $\sigma \in \{1, 3, 7, 31\}$ (e.g., if $\sigma = 7$ then the cue must be rehearsed every week).

λ (visits/day)    2           1           1/3         1/7         1/31
σ = 0.1           0.686669    2.42166     5.7746      7.43555     8.61931
σ = 0.5           0.216598    0.827594    2.75627     4.73269     7.54973
σ = 1             0.153986    0.521866    1.56788     2.61413     4.65353
σ = 2             0.135671    0.386195    0.984956    1.5334      2.57117

Table 7.1: Expanding Rehearsal Assumption: $\mathbb{E}[XR_{365,c}]$ vs. $\lambda_c$ and $\sigma$

λ (visits/day)    2           1           1/3         1/7         1/31
σ = 1             49.5327     134.644     262.25      317.277     354.382
σ = 3             0.3024      6.074       44.8813     79.4756     110.747
σ = 7             0.0000      0.0483297   5.13951     19.4976     42.2872
σ = 31            0.000       0.0000      0.0004      0.1432      4.4146

Table 7.2: Constant Rehearsal Assumption: $\mathbb{E}[XR_{365,c}]$ vs. $\lambda_c$ and $\sigma$
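The table entries can be approximated by evaluating the sum from Theorem 1 for a single cue. The sketch below rests on several assumptions that are not stated in this section: constant rehearsal times are taken to be $t_i = i\sigma$, expanding rehearsal times are taken to be $t_i = 2^{i\sigma}$, and every interval that starts no later than day $t$ is counted (a boundary convention chosen because it matches the tables). Only the constant-rehearsal entries and the $\sigma = 1$ row of the expanding-rehearsal table were checked against it.

```python
import math


def constant_times(sigma, horizon):
    """Assumed CR schedule: t_i = i * sigma, extended one step past the horizon."""
    count = int(horizon // sigma) + 2
    return [i * sigma for i in range(count)]


def expanding_times(sigma, horizon):
    """Assumed ER schedule: t_i = 2 ** (i * sigma), extended one step past the horizon."""
    times, i = [], 0
    while True:
        times.append(2.0 ** (i * sigma))
        if times[-1] > horizon:
            return times
        i += 1


def expected_extra_rehearsals(rate, times, horizon):
    """Sum of exp(-rate * interval length) over intervals starting by the horizon."""
    total = 0.0
    for start, end in zip(times, times[1:]):
        if start > horizon:
            break
        total += math.exp(-rate * (end - start))
    return total


horizon = 365
cr_daily = expected_extra_rehearsals(1.0, constant_times(1, horizon), horizon)
cr_weekly = expected_extra_rehearsals(1 / 3, constant_times(7, horizon), horizon)
er_daily = expected_extra_rehearsals(1.0, expanding_times(1, horizon), horizon)
er_twice = expected_extra_rehearsals(2.0, expanding_times(1, horizon), horizon)
print(cr_daily, cr_weekly, er_daily, er_twice)
```

Under these assumptions the four printed values line up with the corresponding entries of Tables 7.1 and 7.2 to the precision shown there.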


7.3  Baseline  Password  Management  Schemes 

In this section we formalize our baseline password management schemes: Reuse Weak (Algorithm 7.1), Reuse Strong (Algorithm 7.2), Lifehacker (Algorithm 7.3) and Strong Random and Independent (Algorithm 7.4). The first three schemes (Reuse Weak, Reuse Strong, Lifehacker) are easy to use, but only satisfy weak security guarantees. Strong Random and Independent provides very strong security guarantees, but is highly difficult to use.

Vague  instructions  and  strategies  do  not  constitute  a  password  management 
scheme  because  it  is  unclear  what  the  resulting  distribution  over  P  looks  like. 
When given such vague instructions (e.g., "pick a random sentence and use the first letter of each word") people tend to behave predictably (e.g., picking a popular phrase from a movie or book). For example, when people are required to add special symbols to their passwords they tend to use a small set of random symbols and add them in predictable places (e.g., end of the password) [101].
Most  password  advice  provides  only  vague  instructions.  However,  many  of  these 
vague  strategies  can  be  tweaked  to  yield  formal  password  management  schemes. 
Reuse  Weak,  Reuse  Strong,  and  Lifehacker  are  formalizations  of  popular  password 
management  strategies. 

Each of these password management schemes ignores the visitation schedule $\lambda_1, \dots, \lambda_m$. None of the schemes use cues explicitly. However, the user always has an implicit cue when he tries to log in. For example, the implicit cue in Reuse
Weak  might  be  "that  word  that  I  always  use  as  my  password."  We  use  four  implicit 
cues  for  Reuse  Strong  to  represent  the  use  of  four  separate  words  (chunks  [108]). 
These  implicit  cues  are  shared  across  all  accounts  —  a  user  rehearses  the  implicit 
association(s)  when  he  logs  into  any  of  his  accounts. 


Algorithm 7.1 Reuse Weak $\mathcal{G}_m$

Input: Background knowledge $k \in \mathcal{K}$ about the user. Random bits $b$, $\lambda_1, \dots, \lambda_m$.
Random Word: $w \xleftarrow{\$} D_{20,000}$.    ▷ Select $w$ uniformly at random from a dictionary of 20,000 words.
for $i = 1 \to m$ do
    $p_i \leftarrow w$
    $c_i \leftarrow \{\text{'word'}\}$
return $(p_1, c_1), \dots, (p_m, c_m)$

User: Memorizes and rehearses the cue-association pairs $(\text{'word'}, p_i)$ for each account $A_i$ by following the rehearsal schedule (e.g., CR or ER).


Lifehacker  uses  a  derivation  rule  to  get  a  different  password  for  each  account. 
There  is  no  explicit  cue  to  help  the  user  remember  the  derivation  rule,  but  the 
implicit  cue  (e.g.,  "that  derivation  rule  I  always  use  when  I  make  passwords")  is 
shared  across  every  account  —  the  user  rehearses  the  derivation  rule  every  time 
he  logs  into  one  of  his  accounts.  There  are  four  base  cues  —  three  for  the  words, 
one  for  the  derivation  rule. 

Strong Random and Independent also uses implicit cues (e.g., the account name $A_i$), which are not shared across accounts so the only way to naturally rehearse the association $(A_i, p_i)$ is to visit account $A_i$.

Algorithm 7.2 Reuse Strong $\mathcal{G}_m$

Input: Background knowledge $k \in \mathcal{K}$ about the user. Random bits $b$, $\lambda_1, \dots, \lambda_m$.
for $i = 1 \to 4$ do
    Random Word: $w_i \xleftarrow{\$} D_{20,000}$.
for $i = 1 \to m$ do
    $p_i \leftarrow w_1 w_2 w_3 w_4$
    $c_i \leftarrow \{(\text{'Word'}, j) \mid j \in [4]\}$
return $(p_1, c_1), \dots, (p_m, c_m)$

User: Memorizes and rehearses the cue-association pairs $((\text{'Word'}, j), w_j)$ for each $j \in [4]$ by following the rehearsal schedule (e.g., CR or ER).


Algorithm 7.3 Lifehacker $\mathcal{G}_m$

Input: Background knowledge $k \in \mathcal{K}$ about the user. Random bits $b$, $\lambda_1, \dots, \lambda_m$.
for $i = 1 \to 3$ do
    Random Word: $w_i \xleftarrow{\$} D_{20,000}$.
Derivation Rule: $d \xleftarrow{\$} \text{DerivRules}$.    ▷ DerivRules is a set of 50 simple derivation rules to map the name of a site $A_i$ to a string $d(A_i)$ (e.g., use the first three consonants of $A_i$).
for $i = 1 \to m$ do
    $p_i \leftarrow w_1 w_2 w_3 d(A_i)$
    $c_i \leftarrow \{(\text{'Word'}, j) \mid j \in [3]\} \cup \{\text{'Rule'}\}$
return $(p_1, c_1), \dots, (p_m, c_m)$

User: Memorizes and rehearses the cue-association pairs $((\text{'Word'}, j), w_j)$ for each $j \in [3]$ and $(\text{'Rule'}, d)$ by following the rehearsal schedule (e.g., CR or ER).


Algorithm 7.4 Strong Random and Independent $\mathcal{G}_m$

Input: Background knowledge $k \in \mathcal{K}$ about the user. Random bits $b$, $\lambda_1, \dots, \lambda_m$.
for $i = 1 \to m$ do
    for $j = 1 \to 4$ do
        Random Word: $w^i_j \xleftarrow{\$} D_{20,000}$.
    $p_i \leftarrow w^i_1 w^i_2 w^i_3 w^i_4$
    $c_i \leftarrow \{(A_i, j) \mid j \in [4]\}$
return $(p_1, c_1), \dots, (p_m, c_m)$

User: Memorizes and rehearses the association $((A_i, j), w^i_j)$ for each account $A_i$ and $j \in [4]$ by following the rehearsal schedule (e.g., CR or ER).
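The four baseline algorithms can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the thesis's implementation: the 20,000-word dictionary is replaced by synthetic placeholder tokens, only two hypothetical derivation rules stand in for the 50 assumed in the text, and randomness comes from Python's `secrets` module.

```python
import secrets

# Placeholder dictionary: 20,000 synthetic tokens stand in for a real word list.
DICTIONARY = [f"word{i:05d}" for i in range(20_000)]


def first_three_consonants(site):
    """Hypothetical derivation rule: first three consonants of the site name."""
    return "".join(ch for ch in site.lower() if ch.isalpha() and ch not in "aeiou")[:3]


def first_three_letters(site):
    """Hypothetical derivation rule: first three letters of the site name."""
    return site.lower()[:3]


# The text assumes 50 derivation rules; two are enough for a sketch.
DERIV_RULES = [first_three_consonants, first_three_letters]


def random_word():
    return secrets.choice(DICTIONARY)


def reuse_weak(accounts):
    w = random_word()                                 # one word, reused everywhere
    return {a: (w, frozenset({"word"})) for a in accounts}


def reuse_strong(accounts):
    password = "".join(random_word() for _ in range(4))
    cues = frozenset(("Word", j) for j in range(1, 5))
    return {a: (password, cues) for a in accounts}


def lifehacker(accounts):
    base = "".join(random_word() for _ in range(3))
    rule = secrets.choice(DERIV_RULES)                # one rule shared by all accounts
    cues = frozenset(("Word", j) for j in range(1, 4)) | {"Rule"}
    return {a: (base + rule(a), cues) for a in accounts}


def strong_random_independent(accounts):
    result = {}
    for a in accounts:                                # fresh words for every account
        password = "".join(random_word() for _ in range(4))
        result[a] = (password, frozenset((a, j) for j in range(1, 5)))
    return result


accounts = ["amazon", "ebay", "github"]
weak, strong = reuse_weak(accounts), reuse_strong(accounts)
hacker, independent = lifehacker(accounts), strong_random_independent(accounts)
```

Each function returns a mapping from account name to a (password, cue set) pair, mirroring the $(p_i, c_i)$ pairs returned by the algorithms.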




7.3.1  Security  Of  Baseline  Password  Management  Schemes 

Reuse Weak is not $(q_{\$1}, \delta, m, s, 0, 1)$-secure for any $\delta < 1$: an adversary who is only willing to spend $1 on password cracking will still be able to crack the user's passwords! While Reuse Weak does provide some security guarantees against online attacks, they are not very strong. For example, Reuse Weak is not even $(q_{\$1}, 0.01, 100, 3, 0, 0)$-secure because an adversary who executes an online attack can succeed in breaking into at least one of the user's 100 accounts with probability at least 0.01, even if all accounts implement a 3-strike limit. If the adversary recovers any of the user's passwords ($r > 0$) then all security guarantees break down.

Reuse Strong is slightly more secure. It satisfies $(q_{\$10^6}, 3.222 \times 10^{-7}, m, s, 0, m)$-security, meaning that with high probability an adversary who has not been able to recover any of the user's passwords will not even be able to mount a successful offline attack against the user. However, Reuse Strong is not $(q, \delta, m, s, 1, 0)$-secure: if the adversary is able to recover just one password $p_i$ for any account $A_i$ then the adversary will be able to compromise all of the user's accounts.

Lifehacker is supposed to limit the damage of a recovery attack by using a derived string at the end of each password. However, in our security model the adversary knows that the user used Lifehacker to generate his passwords. The original article [4] instructs users to pick a simple derivation rule (e.g., "use the first three consonants in the site name"). Because this instruction is vague we assume that there is a set of 50 derivation rules and that one is selected at random. If the adversary sees a password $p_i = w_1 w_2 w_3 d(A_i)$ for account $A_i$ then he can immediately infer the base password $b = w_1 w_2 w_3$, and the adversary needs at most 50 guesses to discover one of the user's passwords¹, so if $(m - 1)s \ge 50$ then Lifehacker is not $(q, \delta, m, s, 1, 0)$-secure for any values of $\delta, q$. Lifehacker is $(q_{\$10^6}, 1.29 \times 10^{-4}, m, s, 0, m)$-secure: it defends against offline and online attacks in the absence of recovery attacks.

Strong Random and Independent is highly secure! It satisfies $(q_{\$10^6}, 3.222 \times 10^{-7}, m, s, \alpha, m)$-security for any $\alpha \le m$. This means that even after the adversary learns many of the user's passwords he will fail to crack any other password with high probability. Unfortunately, Strong Random and Independent is very difficult to use.
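The numeric security parameters quoted above can be reproduced with simple arithmetic. The sketch below assumes that passwords are uniform over the scheme's password space, that the adversary's success probability is bounded by (number of guesses) / (size of the space), and that a dollar buys $5.155 \times 10^4$ BCRYPT guesses (the figure from Table 7.3). The function names are illustrative.

```python
DICTIONARY_SIZE = 20_000
GUESSES_PER_DOLLAR = 5.155e4     # BCRYPT figure from Table 7.3
NUM_DERIVATION_RULES = 50


def offline_success_bound(budget_dollars, space_size):
    """Chance that a budget-limited offline attack hits a uniform password."""
    guesses = GUESSES_PER_DOLLAR * budget_dollars
    return min(1.0, guesses / space_size)


def online_success_bound(num_accounts, strikes, space_size):
    """Chance that s guesses at each of m accounts hit a single reused password."""
    return min(1.0, num_accounts * strikes / space_size)


reuse_weak_offline = offline_success_bound(1, DICTIONARY_SIZE)
reuse_weak_online = online_success_bound(100, 3, DICTIONARY_SIZE)
reuse_strong_offline = offline_success_bound(1e6, DICTIONARY_SIZE ** 4)
lifehacker_offline = offline_success_bound(
    1e6, DICTIONARY_SIZE ** 3 * NUM_DERIVATION_RULES
)
print(reuse_weak_offline, reuse_weak_online, reuse_strong_offline, lifehacker_offline)
```

A $1 budget already exhausts the 20,000-word dictionary, while four random words push the offline bound for a $10^6 budget down to roughly $3.2 \times 10^{-7}$.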

¹ In fact the adversary most likely needs far fewer guesses. He can immediately eliminate any derivation rule $d'$ such that $d'(A_i) \ne d(A_i)$. Most likely this will include almost all derivation rules besides the correct one.

7.3.2 Usability of Baseline Schemes


Usability results for Lifehacker and Strong Random and Independent can be found in Table 2.2 of the paper. We evaluate usability using the formula from Theorem 1. We present our results for the Very Active, Typical, Occasional and Infrequent users under both sufficient rehearsal assumptions CR and ER, with association strength $\sigma = 1$. The usability results for Reuse Strong are identical to Lifehacker, because they have the same number of cues and each cue is rehearsed anytime the user visits any account $A_i$. Similarly, the usability results for Reuse Weak are better by a factor of 4 (because there is only one cue-association pair to rehearse and the natural rehearsal rates are identical).


7.3.3  Sources  of  Randomness 

Popular password advice tends to be informal: the user is instructed to select a character/number/digit/word, but is not told how to do this. Certainly one reason why people do not select random passwords is because they worry about forgetting their password [102]. However, even if the user is told to select a character uniformly at random it is still impossible to make any formal security guarantees without understanding the entropy of a humanly generated random sequence. People have difficulty consciously generating a random sequence of numbers even when they are not trying to construct a memorable sequence [154] [111] [74].

This does not rule out the possibility that a human-generated random sequence could provide a weak source of entropy [88], which could be used to extract a truly random sequence with computer assistance [65, 137]. We envision a computer program being used to generate random words from a dictionary or random stories (e.g., Person-Action-Object stories) for the user to memorize. The source of randomness could come from the computer itself or it could be extracted from a human source (e.g., a user randomly typing on the keyboard).


7.4  Other  Measures  of  Password  Strength 

In this section we discuss other security metrics (e.g., entropy, minimum entropy, password strength meters, $\alpha$-guesswork) and their relationship to our security model.

Our security model is fundamentally different from metrics like guessing entropy (e.g., How many guesses does an adversary need to guess all of the passwords in a dataset [107]?) and partial guessing entropy (e.g., How many guesses does the adversary need to crack an $\alpha$-fraction of the passwords in a dataset [39, 121]? How many passwords can the adversary break with $\beta$ guesses per account [45]?), which take the perspective of a system administrator who is trying to protect many users with password protected accounts on his server. For example, a system administrator who wants to evaluate the security effects of a new password composition policy may be interested in knowing what fraction of user accounts are vulnerable to offline attacks. By contrast, our security model takes the perspective of the user who has many different password protected accounts. This user wants to evaluate the security of various password management schemes that he could choose to adopt.

Our threat model is also strictly stronger than the threat models behind metrics like $\alpha$-guesswork because we consider targeted attacks from an adversary who may have already compromised some of the user's accounts.

Password strength meters can provide useful feedback to a user (e.g., they rule out some insecure password management schemes). However, password strength meters are insufficient for our setting for several reasons: (1) They fail to rule out some weak passwords. (2) They cannot take correlations between a user's passwords (e.g., Is the user reusing the same password?) into account. (3) They do not model the adversary's background knowledge about the user (e.g., Does the adversary know the user's birth date or favorite hobbies?). Entropy is a bad measure of security for the same reasons. While minimum entropy fixes some of these problems, it still does not address problem (2): minimum entropy does not deal with correlated user passwords.

7.4.1  Password  Strength  Meters 

Password strength meters use simple heuristics (e.g., length, character set) to estimate the entropy of a password. A password strength meter can provide useful feedback to the user by warning the user when he picks passwords that are easy to guess. However, password strength meters can also give users a false sense of confidence (e.g., 'mmmmmmmmmmmmmmmmmmmmmmmmmmmm' is clearly predictable, but is ranked 'Best' by some meters [2]; see Figure 7.1 [2]). A password like Mm1!Mm1!Mm1!Mm1!Mm1!Mm1! would be rated as very secure by almost any password strength meter because it is long, it uses upper case and lower case letters and it includes a special symbol (!). However, the password is based on a very simple repeated pattern and has low entropy (e.g., it could be compressed easily). A password strength meter cannot guarantee that a password is secure because (1) it does not know whether or not the user has already used this password (or a very similar password) somewhere else, (2) it does not know if the user is basing his password on personal knowledge (e.g., wife's birthday), and (3) it does not know what background knowledge the adversary might have about the user (e.g., does the adversary know the user's wife's birthday?).


[Figure: screenshot of an online password checker titled "Check your password: is it strong?", with a password entry box and a strength indicator.]

Figure 7.1: mmmmmmmmmmmmmmmmmmmmmmmmmmmm: sounds delicious, but is it really a strong password?


7.4.2  Entropy 

Entropy [139] can be used to measure the average number of guesses an adversary would need to guess a password chosen at random from a distribution $D$ over passwords

$$H(D) = \sum_x \Pr[x \mid D] \log_2\left(\frac{1}{\Pr[x \mid D]}\right).$$

While entropy has been a commonly used information theoretic measure of password strength [101, 109], it is not always a good indicator of password strength [107]. For example, consider the following distributions over binary passwords $D_1$ and $D_2$:

$$D_1(n) = \begin{cases} 1^{n-1} & \text{with probability } 1/2, \\ x \in \{0, 1\}^{2n-2} & \text{with probability } 2^{-2n+1}. \end{cases}$$

$$D_2(n) = x \in \{0, 1\}^n \text{ with probability } 2^{-n}.$$

While there is no difference in the entropy of both generators

$$H(D_1(n)) = \frac{1}{2} \log_2(2) + \sum_x 2^{-2n+1} \log_2\left(2^{2n-1}\right) = \frac{1}{2} + \frac{2n - 1}{2} = n = H(D_2(n)),$$

$D_1$ and $D_2$ are by no means equivalent from a security standpoint! After just one guess an adversary can successfully recover the password generated by $D_1$ with probability $\ge \frac{1}{2}$! By contrast an adversary would need at least $2^{n-1}$ guesses to recover the password generated by $D_2$ with probability $\ge \frac{1}{2}$.


7.4.3  Minimum  Entropy 

If we instead consider the minimum entropy

$$H_{\min}(\mathcal{G}) = \min_x \log_2\left(\frac{1}{\Pr[x \mid \mathcal{G}]}\right),$$

of both generators we get a different story

$$H_{\min}(D_1(n)) = \log_2(2) = 1 \ll H_{\min}(D_2(n)) = \log_2(2^n) = n.$$
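The comparison between $D_1(n)$ and $D_2(n)$ can be checked numerically for a small $n$. The sketch below only needs the probability of each outcome, not the outcomes themselves; the helper names are illustrative.

```python
import math


def shannon_entropy(probs):
    """H(D) = sum over outcomes of p * log2(1 / p)."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)


def min_entropy(probs):
    """H_min(D) = log2(1 / max probability)."""
    return math.log2(1 / max(probs))


def d1_probabilities(n):
    # One string with probability 1/2, plus 2^(2n-2) strings with probability 2^(-2n+1).
    return [0.5] + [2.0 ** (-2 * n + 1)] * (2 ** (2 * n - 2))


def d2_probabilities(n):
    # 2^n strings, each with probability 2^(-n).
    return [2.0 ** (-n)] * (2 ** n)


n = 4
d1, d2 = d1_probabilities(n), d2_probabilities(n)
print(shannon_entropy(d1), shannon_entropy(d2))   # both equal n
print(min_entropy(d1), min_entropy(d2))           # 1 versus n
```

Shannon entropy cannot tell the two generators apart, while minimum entropy exposes the single outcome of $D_1$ that carries half of the probability mass.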

High minimum entropy guarantees that with high probability any adversary will fail to guess the password even after many guesses. However, even minimum entropy is not a great measure of security when the user is managing multiple passwords because it does not consider correlations between passwords. Suppose for example that each user needs two passwords $(x_1, x_2)$ and again consider two password distributions $D_1$ and $D_2$ redefined below:

$$D_1(n) = (x, x) \text{ with probability } 2^{-2n} \text{ for each } x \in \{0, 1\}^{2n}.$$

$$D_2(n) = (x_1, x_2) \text{ with probability } 2^{-2n} \text{ for each } (x_1, x_2) \in \{0, 1\}^n \times \{0, 1\}^n.$$

The min-entropy of both generators is the same ($2n$). However, $D_1$ provides no security guarantees against a recovery attack: any adversary who knows $x_2$ can immediately guess $x_1$. By contrast, when the passwords are chosen from $D_2$ an adversary who knows $x_2$ has no advantage in guessing $x_1$.

Benefit (B)    BCRYPT                      MD5                       SHA1
$q_{\$B}$      $B (5.155 \times 10^4)$     $B (9.1 \times 10^9)$     $B \times 10^{10}$

Table 7.3: Upper Bound: $q_{\$B}$ for BCRYPT, MD5 and SHA1


7.5  Economics 

In this section we discuss how the parameter $q_{\$B}$, our upper bound on the total number of adversary guesses, could be selected. Our upper bound is based on the economic cost of guessing. Guessing is not free! The basic premise is that the adversary will not try more than $q_{\$B}$ guesses to break into an account if his maximum benefit from the attack is $B. The cost of guessing is influenced by several factors including the cost of renting or buying computing equipment (e.g., Cray, GPUs), the cost of electricity to run the computers and the complexity of the cryptographic hash function used to encrypt the password. The value of $q_{\$B}$ depends greatly on the specific choice of the cryptographic hash function. Table 7.3 shows the values of $q_{\$B}$ we computed for the BCRYPT, SHA1 and MD5 hash functions.


7.5.1  Password  Storage 

There  are  many  cryptographic  hash  functions  that  a  company  might  use  (e.g., 
MD5,  SHA1,  SHA2,  BCRYPT)  to  store  passwords.  Some  hash  functions  like 
BCRYPT  [122]  were  designed  specifically  with  passwords  in  mind  —  BCRYPT 
was  intentionally  designed  to  be  slow  to  compute  (e.g.,  to  limit  the  power  of  an 
adversary's  offline  attack).  The  BCRYPT  hash  function  takes  a  parameter  which 
allows  the  programmer  to  specify  how  slow  the  hash  computation  should  be 
—  we  used  L12  in  our  experiments.  By  contrast,  MD5,  SHA1  and  SHA2  were 
designed  for  fast  hardware  computation.  Unfortunately,  SHA1  and  MD5  are  more 
commonly  used  to  hash  passwords  [13].  In  economic  terms,  hash  functions  like 
BCRYPT increase the adversary's cost of guessing. We use $F_H$ to denote the number of times that the hash function $H$ can be computed in one hour on a 1 GHz processor. We estimated $F_H$ experimentally on a Dell Optiplex 960 computer for BCRYPT, MD5 and SHA1 (Table 7.4). As expected, the value of $F_H$ is much lower for BCRYPT than for SHA1 and MD5.

The rainbow table attack can be used to significantly speed up password cracking attempts after the adversary performs some precomputation [117]. Rainbow table attacks can be prevented by a practice known as password salting (e.g., instead of storing the cryptographic hash of the password $H(p)$ the server stores $(H(p, r), r)$ for a random string $r$) [16].

Note: In reality, many companies do not salt their passwords [9, 13] (in fact some do not even hash them [5]). In this paper, we assume that passwords are stored properly (e.g., salted and hashed), and we use optimistic estimates for $q_{\$B}$ based on the BCRYPT hash function. To justify these decisions we observe that a user could easily ensure that his passwords are salted and encrypted with a slow hash function $f$ (e.g., BCRYPT [122]) by using $f(U, A_i, p_i)$ as his password for account $i$, where $U$ is the username and $A_i$ is the name of account $i$. Because the function $f$ is not a secret, its code could be stored locally on any machine being used or publicly on the cloud.


7.5.2  Attack  Cost  and  Benefit 

Suppose that company $A_i$ is hacked, and that the usernames and password hashes are stolen by an adversary. We will assume that company $A_i$ has been following good password storage practices (e.g., company $A_i$ hashes all of their passwords with a strong cryptographic hash function, and company $A_i$ salts all of their password hashes). The adversary can purchase any computing equipment he desires (e.g., Cray supercomputer, GPUs, etc.) and run any password cracker he wants for as long as he wants. The adversary's primary limitation is money. It costs money to buy all of this equipment, and it costs money to run the equipment. If the adversary dedicates equipment to run a password cracker for several years then the equipment may be obsolete by the time he is finished (depreciation). We define $C_g$ to be the amortized cost per guess for the adversary.


7.5.3  Cost  of  Guessing 

Included in the amortized guessing cost are: the price of electricity and the cost of equipment. We estimate $C_g$ by assuming that the adversary rents computing time on Amazon's cloud EC2 [1]. This allows us to easily account for factors like energy costs, equipment failure and equipment depreciation. Amazon measures rented computing power in ECUs [1]: "One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor."

We use $C_{GHz}$ to denote the cost of renting a 1 GHz processor for 1 hour on Amazon. We have

$$C_g = \frac{C_{GHz}}{F_H}.$$

Using the Cluster GPU Instance rental option the adversary could rent 33.5 ECU compute units for $2.10 per hour ($C_{GHz} = \$0.06$).

Our results are presented in Table 7.4.

Hash Function (H)    $F_H$                                    $C_g$
SHA1                 ~ $576 \times 10^6$ guesses per hour     $\$1 \times 10^{-10}$
MD5                  ~ $561 \times 10^6$ guesses per hour     $\$1.1 \times 10^{-10}$
BCRYPT (L12)         ~ $31 \times 10^3$ guesses per hour      $\$1.94 \times 10^{-5}$

Table 7.4: Guessing Costs
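The relationship between hash speed, cost per guess and the guess budget can be written out as a short calculation. The sketch below assumes $C_g = C_{GHz} / F_H$ and $q_{\$B} = B / C_g$; the SHA1 and MD5 rates are the $F_H$ values from Table 7.4, while the BCRYPT cost per guess is taken directly from the cost column of that table rather than recomputed. Function and variable names are illustrative.

```python
def cost_per_guess(cost_per_ghz_hour, hashes_per_ghz_hour):
    """C_g = C_GHz / F_H, in dollars per guess."""
    return cost_per_ghz_hour / hashes_per_ghz_hour


def max_guesses(budget_dollars, cost_per_guess_dollars):
    """q_{$B} = B / C_g: guesses an adversary can afford with budget B."""
    return budget_dollars / cost_per_guess_dollars


C_GHZ = 0.06                                  # dollars per 1 GHz processor-hour
sha1_cost = cost_per_guess(C_GHZ, 576e6)      # F_H for SHA1 from Table 7.4
md5_cost = cost_per_guess(C_GHZ, 561e6)       # F_H for MD5 from Table 7.4
bcrypt_cost = 1.94e-5                         # C_g for BCRYPT (L12) from Table 7.4

budget = 1_000_000
sha1_guesses = max_guesses(budget, sha1_cost)
md5_guesses = max_guesses(budget, md5_cost)
bcrypt_guesses = max_guesses(budget, bcrypt_cost)
print(f"{sha1_guesses:.2e} {md5_guesses:.2e} {bcrypt_guesses:.2e}")
```

The several orders of magnitude between the fast hashes and BCRYPT illustrate why a deliberately slow hash function shrinks the adversary's affordable guess budget.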


7.5.4  Benefit 

The benefit $B_j$ of cracking an account $A_j$ is dependent on both the type of account (e.g., banking, e-mail, commerce, social network, leisure) and the adversary's background knowledge about the user (e.g., Does the user reuse passwords? Is the user rich? Is the user a celebrity?).

Password reuse has a tremendous impact on $B$. An adversary who cracked a user's ESPN account would likely get little benefit unless the user reused the password elsewhere. For most non-celebrities, $B_j$ can be upper bounded by the total amount of money that the user has in all of his financial accounts. In fact, this may be a significant overestimate, even if the user reuses passwords, because banks are usually successful in reversing large fraudulent transfers [77]. Indeed, most cracked passwords sell for between $4 and $17 on the black market [79]. An adversary might also benefit by exploiting the user's social connections (e.g., tricking the user's friends into wiring money). Some users' passwords may also be valuable because they give access to valuable information (e.g., celebrity gossip, trade secrets).

Most users should be able to safely assume that no adversary will spend more than $1,000,000 to crack their account even if they reuse passwords. Table 7.5 shows the value of $q_{\$1,000,000}$ for various hash functions.

Hash Function    $q_{\$1,000,000}$
SHA1             $10^{16}$
MD5              $9.1 \times 10^{15}$
BCRYPT (L12)     $5.2 \times 10^{10}$

Table 7.5: $q_{\$1,000,000}$


7.6  Associative  Memory  and  Sufficient  Rehearsal  Assumptions

The expanding rehearsal assumption makes empirical predictions about long term memory retention (e.g., a user who follows a rehearsal schedule for a cue-association pair will retain that memory for many years). Empirical studies of human memory are often limited in duration due to practical constraints.

The  most  relevant  long  term  memory  study  was  conducted  by  Wozniak  and 
Gorzelanczyk  [160].  They  supervised  a  group  of  7  people  who  learned  35,000 
Polish-English  word  pairs  over  18  months.  Their  goal  was  to  optimize  the  intervals 
between  rehearsal  of  each  word  pair.  They  ended  up  with  the  following  recursive 
formula 

$$I(EF, R) = I(EF, R - 1) \times OF(EF, R),$$

where $I(EF, R)$ denotes the time interval before the $R$-th rehearsal, $EF$ denotes the easiness factor of the particular word pair, and $OF(EF, R)$ is a coefficient matrix which specifies how quickly the intervals grow². The intervals are very similar to those generated by the expanding rehearsal assumption. Our association strength parameter $\sigma$ is similar to the easiness factor $EF$. However, in the expanding rehearsal assumption $OF(EF, R)$ would be a constant that does not vary with $R$.

Squire tested very long term memory retention by conducting a series of studies over 30 years [144]. To conduct his studies Squire selected a TV show that was canceled after one season, and quizzed participants about the show. It was not surprising that participants in the early studies, conducted right after the show was canceled, had the best performance on the quizzes. However, after a couple of years performance dropped to a stable asymptote [144]. The fact that participants were able to remember some details about the show after 30 years suggests that it is possible to maintain a cue-association pair in memory without satisfying all of the rehearsal requirements given by our pessimistic constant rehearsal assumption.

² SuperMemo, a popular commercial memory program http://www.supermemo.com/, also uses a similar rehearsal schedule.


7.6.1  Squared  Rehearsal  Assumption 

Anderson and Schooler demonstrated that the availability of a memory is correlated with recency and the pattern of previous exposures (rehearsals) to the item [18]. Eventually, the following equation was proposed

$$A_i(t) = \sum_{j=1}^{n} \frac{1}{\sqrt{t - t_j}},$$

where $A_i(t)$ denotes the availability of item $i$ in memory at time $t$ and $t_1, \ldots, t_n < t$ denote the previous exposures to item $i$ [151]. In this model the rehearsal schedule $R(\hat{c}, j) = j^2$ is sufficient to maintain high availability. To see this consider an arbitrary time $t$ and let $k$ be the integer such that $k^2 < t \le (k+1)^2$. Because $t_k = k^2 < t$ at least $k$ previous rehearsals have occurred by time $t$ so

$$A_i(t) \ge \sum_{j=1}^{k} \frac{1}{\sqrt{t - j^2}} \ge \frac{k}{k+1} \ge \frac{1}{2}.$$

Squared Rehearsal Assumption (SQ): The rehearsal schedule given by $R(\hat{c}, i) = i^2 \sigma$ is sufficient to maintain the association $(\hat{c}, \hat{a})$.
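A quick numerical check of the argument is sketched below. It assumes the availability model $A_i(t) = \sum_j 1/\sqrt{t - t_j}$ and the schedule $t_j = j^2$ (that is, $\sigma = 1$); the horizon and the sampled time points are arbitrary illustrative choices.

```python
import math


def availability(t, exposures):
    """A(t) = sum over past exposures t_j < t of 1 / sqrt(t - t_j)."""
    return sum(1.0 / math.sqrt(t - tj) for tj in exposures if tj < t)


def squared_schedule(horizon):
    """Rehearsal times 1, 4, 9, ... up to the horizon."""
    times, j = [], 1
    while j * j <= horizon:
        times.append(j * j)
        j += 1
    return times


exposures = squared_schedule(10_000)

# Sample between integer times so that t - t_j is never zero.
lowest = min(availability(t + 0.5, exposures) for t in range(1, 10_000))
print(f"lowest sampled availability: {lowest:.3f}")
```

Under these assumptions the sampled availability never drops below a fixed constant, which is the sense in which the squared schedule "maintains" the memory.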


While SQ is certainly not equivalent to ER, it is worth noting that our general conclusions are the same under both memory assumptions. The rehearsal intervals grow with time under both memory assumptions, yielding similar usability predictions (compare Tables 2.2, 2.3 and 7.6). The usability predictions are still that (1) Strong Random and Independent, though highly secure, requires any user with infrequently visited accounts to spend a lot of extra time rehearsing passwords, (2) Lifehacker requires little effort but is highly insecure, (3) SC-0, which is almost as good as Lifehacker from a usability standpoint, provides the user with some provable security guarantees, and (4) SC-1 and SC-2 are reasonably easy to use (except for the Infrequent user) and provide strong provable security guarantees, though not as strong as Strong Random and Independent.

Schedule/Scheme   B+D     SC-0    SC-1    SC-2     SRI
Very Active       ≈ 0     ≈ 0     2.77    5.88     794.7
Typical           ≈ 0     ≈ 0     7.086   12.74    882.8
Occasional        ≈ 0     ≈ 0     8.86    16.03    719.02
Infrequent        .188    2.08    71.42   125.24   1176.4

Table 7.6: $\mathbb{E}[XR_{365}]$: Extra Rehearsals over the first year under the Squared Rehearsal Assumption, $\sigma = 1$.
B+D: Lifehacker
SRI: Strong Random and Independent


While the expanding rehearsal assumption yields fewer rehearsal requirements over the first year, the usability results for Lifehacker and Shared Cues are even stronger because the intervals initially grow faster. The usability results are worse for Strong Random and Independent because many of the cues are naturally rehearsed with frequency $\lambda = 1/365$; in this case most rehearsal requirements will require an extra rehearsal³.


7.7  $(n, \ell, \gamma)$-sharing Set Families

Our notion of $(n, \ell, \gamma)$-sharing set families (Definition 5) is equivalent to Nisan and Wigderson's definition of a $(k, m)$-design [115]. Nisan and Wigderson provided several constructions of $(k, m)$-designs. For example, one of their constructions implies that there is an $(n, \ell, \gamma)$-sharing set family of size $m = \ell^t$ for $n = \ell^2$ and $\gamma = t - 1$, whenever $\ell$ is a prime power. While this construction is useful for building pseudorandom bit generators, it is not especially helpful in the password context because $\ell$ should be a small constant. For example, if we set $\ell = 4$ and the user needs to create $m = 64$ accounts then we would need to set $t = \log_\ell m = 3$. This gives a $(128, 4, 1)$-sharing set family of size $m = 64$ or a $(32, 4, 2)$-sharing set family of size $m = 64$. By contrast, we can construct a $(43, 4, 1)$-sharing set family of size $m > 64$ as well as a $(23, 4, 2)$-sharing set family of size $m > 64$ (observe that $23 = 3 + 5 + 7 + 8$ and $3 \times 5 \times 7 > 64$ so we can apply the Chinese Remainder Theorem construction from Chapter 2.6). In Section 7.7.1 we show that the Chinese Remainder Theorem

construction from Section 2.6 can be improved slightly. In Section 7.7.2 we show that our construction of $(n, \ell, \gamma)$-sharing set families may have applications to the construction of highly parallelizable pseudorandom generators. Section 7.7.2 is based on work of Beideman and Blocki [25].

³ The usability results for our occasional user are better than the very active user because the occasional user has fewer sites that are visited with frequency $\lambda = 1/365$.

$(n, \ell, \gamma)$-sharing   Lower Bound ($m$)    Upper Bound (Thm 3)   Comment
$(n, \ell, \ell - 1)$         $\binom{n}{\ell}$    $\binom{n}{\ell}$     Claim 8
(9, 4, 3)                     126                  126                   Greedy Construction (Alg 7.5)
(16, 4, 1)                    16                   20                    Greedy Construction (Alg 7.5)
(20, 6, 2)                    40                   57                    Greedy Construction (Alg 7.5)
(25, 6, 2)                    77                   153                   Greedy Construction (Alg 7.5)
(18, 6, 3)                    88                   204                   Greedy Construction (Alg 7.5)
(19, 6, 3)                    118                  258                   Greedy Construction (Alg 7.5)
(30, 9, 3)                    36                   217                   Greedy Construction (Alg 7.5)
(40, 8, 2)                    52                   176                   Greedy Construction (Alg 7.5)
(43, 4, 1)                    110                  150                   Theorem 22

Table 7.7: $(n, \ell, \gamma)$-sharing set family constructions

7.7.1  Improved  Constructions 

In this section we discuss additional $(n, \ell, \gamma)$-sharing set family constructions. Theorem 22 demonstrates how our Chinese Remainder Theorem construction can be improved slightly. For example, we can get a $(43, 4, 1)$-sharing set family of size $m = 110$ with the additional optimizations from Theorem 22, compared with $m = 90$ without the optimizations. We also use a greedy algorithm to construct $(n, \ell, \gamma)$-sharing set families for smaller values of $n$. Our results are summarized in Table 7.7, where we also include the theoretical upper bound from Theorem 3 for comparison.

Theorem 22. Suppose that $n_1 < \ldots < n_\ell$ are pairwise co-prime and that for each $1 \le i \le \ell$ there is a $(n_i, \ell, \gamma)$-sharing set family of size $m_i$. Then there is a $\left(\sum_{i=1}^{\ell} n_i, \ell, \gamma\right)$-sharing set family of size $m = \prod_{i=1}^{\gamma+1} n_i + \sum_{i=1}^{\ell} m_i$.

Algorithm 7.5 Greedy Construction

Input: $n, \ell, \gamma$
All Subsets: $\mathcal{S}' \leftarrow \{S \subseteq [n] \mid |S| = \ell\}$
Candidates: $\mathcal{S} \leftarrow \emptyset$
for all $S \in \mathcal{S}'$ do
    okToAdd $\leftarrow$ True
    for all $T \in \mathcal{S}$ do
        if $|T \cap S| > \gamma$ then
            okToAdd $\leftarrow$ False
    if okToAdd then
        $\mathcal{S} \leftarrow \mathcal{S} \cup \{S\}$
return $\mathcal{S}$
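The greedy procedure of Algorithm 7.5 can be sketched in a few lines. This is an illustrative version only: it enumerates the $\ell$-subsets in lexicographic order, which is an assumption, since the order used to produce the table entries is not stated and different orders can yield families of different sizes.

```python
from itertools import combinations


def greedy_sharing_family(n, ell, gamma):
    """Greedily collect ell-subsets of {0..n-1} with pairwise overlap <= gamma."""
    family = []
    for candidate in combinations(range(n), ell):
        candidate = frozenset(candidate)
        # Keep the candidate only if it is compatible with every chosen set.
        if all(len(candidate & chosen) <= gamma for chosen in family):
            family.append(candidate)
    return family


def max_overlap(family):
    """Largest pairwise intersection size in the family."""
    return max(len(a & b) for a, b in combinations(family, 2))


all_four_subsets = greedy_sharing_family(9, 4, 3)
small_family = greedy_sharing_family(7, 3, 1)
print(len(all_four_subsets), len(small_family), max_overlap(small_family))
```

For $\gamma = \ell - 1$ every candidate is accepted, so the $(9, 4, 3)$ run returns all $\binom{9}{4}$ subsets, in line with Claim 8. The running time is exponential in general, so the sketch is only practical for small $n$.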

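Proof. We can use Algorithm 2.1 to construct a $(n, \ell, \gamma)$-sharing set family $\mathcal{S}_0$ of size $m' = \prod_{i=1}^{\gamma+1} n_i$, where $n = \sum_{i=1}^{\ell} n_i$. Let $T_1 = \{k \mid k < n_1\}$ and for each $i > 1$ let $T_i = \left\{k + \sum_{j=1}^{i-1} n_j \mid k < n_i\right\}$. By construction of $\mathcal{S}_0$ it follows that for each $S \in \mathcal{S}_0$ and each $1 \le i \le \ell$ we have $|S \cap T_i| = 1$. By assumption, for each $i \ge 1$ there is a $(n_i, \ell, \gamma)$-sharing family of subsets of $T_i$ of size $m_i$, denoted $\mathcal{S}_i$. For each pair $S' \in \mathcal{S}_i$ and $S \in \mathcal{S}_0$ we have

$$|S \cap S'| \le |S \cap T_i| \le 1,$$

and for each pair $S' \in \mathcal{S}_i$ and $S \in \mathcal{S}_i$ ($S \ne S'$)

$$|S \cap S'| \le \gamma$$

because $\mathcal{S}_i$ is $(n_i, \ell, \gamma)$-sharing. Finally, for each pair $S' \in \mathcal{S}_i$ and $S \in \mathcal{S}_j$ ($j \ne i$) we have

$$|S \cap S'| \le |S \cap T_i| \le 0.$$

Therefore,

$$\mathcal{S} = \bigcup_{i=0}^{\ell} \mathcal{S}_i$$

is a $\left(\sum_{i=1}^{\ell} n_i, \ell, \gamma\right)$-sharing set family of size $m = \prod_{i=1}^{\gamma+1} n_i + \sum_{i=1}^{\ell} m_i$. □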

Claim 8. For any $0 < \ell < n$ there is a $(n, \ell, \ell - 1)$-sharing set family of size $m = \binom{n}{\ell}$, and there is no $(n, \ell, \ell - 1)$-sharing set family of size $m' > m$.

Proof. It is easy to verify that

$$\mathcal{S} = \{S \subseteq [n] \mid |S| = \ell\},$$

the set of all subsets of size $\ell$, is a $(n, \ell, \ell - 1)$-sharing set family of size $m = \binom{n}{\ell}$. Optimality follows immediately by setting $\gamma = \ell - 1$ in Theorem 3. □

7.7.2  Applications  to  Pseudorandom  Number  Generators 

Applications to Pseudorandom Number Generation. A pseudorandom number generator is a function $G : \{0,1\}^n \to \{0,1\}^m$ which takes a uniformly random seed $x \sim \{0,1\}^n$ of length $n$, and outputs a string $G(x) \in \{0,1\}^m$ ($m \gg n$) which "looks random." Nisan and Wigderson used a $(n, \ell = O(\sqrt{n}), \gamma = \log m)$-sharing set family $\mathcal{S} = \{S_1, \ldots, S_m\}$ of size $m$ to construct pseudorandom number generators [115]. In particular, they define the pseudorandom number generator $NW_{P,\mathcal{S}}(x) = P(x|_{S_1}) \ldots P(x|_{S_m})$, where $x|_{S_i} \in \{0,1\}^\ell$ denotes the bits of $x \in \{0,1\}^n$ at the indices specified by $S_i$ and $P : \{0,1\}^\ell \to \{0,1\}$ is a predicate. If the predicate $P$ is "hard" for circuits of size $H_\ell(P)$ to predict⁴ then no circuit of size $H_\ell(P) - O(m 2^\gamma)$ will be able to distinguish $NW_{P,\mathcal{S}}(x)$ from a truly random binary string of length $m$, when the seed $x \sim \{0,1\}^n$ is chosen uniformly at random. In this context, $n$ is the length of the random seed, $m$ is the number of random bits extracted and the pseudorandom number generator fools circuits of size $H_\ell(P) - O(m 2^\gamma)$. Thus, we would like to find $(n, \ell, \gamma)$-sharing set families where $n$ is small, $m$ is large (so that we can extract many pseudorandom bits from a small seed) and $\gamma$ is small (so that the pseudorandom bits look random to a large circuit). Nisan and Wigderson gave an explicit construction of an $(\ell^2, \ell, \gamma)$-sharing set family of size $\ell^{\gamma+1}$.
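The shape of the Nisan-Wigderson generator is easy to see in code. The sketch below is illustrative only: the seed, the three-set family and the parity predicate are hypothetical toy choices, and parity is certainly not a hard predicate, so this shows the data flow rather than a secure generator.

```python
def restrict(seed_bits, index_set):
    """x|_S: the seed bits at the positions listed in S, in increasing order."""
    return [seed_bits[i] for i in sorted(index_set)]


def parity(bits):
    """Toy stand-in for the predicate P (NOT hard for small circuits)."""
    return sum(bits) % 2


def nw_generator(seed_bits, family, predicate):
    """Output bit j is P applied to the seed restricted to S_j."""
    return [predicate(restrict(seed_bits, s)) for s in family]


seed = [1, 0, 1, 1, 0, 0]                    # n = 6 seed bits
family = [{0, 1, 2}, {0, 3, 4}, {1, 3, 5}]   # ell = 3, pairwise overlap <= 1
output = nw_generator(seed, family, parity)
print(output)
```

Each output bit depends only on its own set $S_j$, which is why the bits can be computed independently of one another.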


⁴ Nisan and Wigderson observe that a random predicate $P$ will satisfy this property with high probability [115].

Applications to Randomness Extractors. Trevisan used the pseudorandom number generator of Nisan and Wigderson to construct a randomness extractor [149]. A $(k, \epsilon)$ randomness extractor is a function $Ext : \{0,1\}^\ell \times \{0,1\}^n \to \{0,1\}^m$ that takes a string $x_1 \sim D$, where $D$ is a distribution over $\{0,1\}^\ell$ with minimum entropy $k$, along with $n$ additional uniformly random bits $x_2 \sim \{0,1\}^n$ and extracts an $m$-bit string $y \in \{0,1\}^m$ that is almost uniformly random (e.g., the distribution over $y \in \{0,1\}^m$ is $\epsilon$-close to the uniform distribution $U_m$ over $\{0,1\}^m$). Trevisan used the string $x_1$ to select a random predicate $P : \{0,1\}^\ell \to \{0,1\}$, and then extracted $m$ bits by running $NW_{P,\mathcal{S}}(x_2)$. Raz et al [126] observed that the pseudorandom number generator of Nisan and Wigderson could be built using a weak $(n, \ell, \gamma)$-sharing set family of size $m$, and showed how to construct a weak $(n, \ell, \gamma)$-sharing set family
of  size  m  for  any  value  of  m  as  long  as  n  >  ]  L  However,  their  construction 

was not explicit (informally, we say that a construction is explicit if there is a fast parallel algorithm to output the $i$'th set). Hartman and Raz [89] showed how to use the Nisan-Wigderson construction to obtain an explicit construction of weak $(n, \ell, \gamma)$-sharing set families. While their construction requires less space than our construction, our construction can be computed faster on a parallel machine.


Advantages of Explicit Constructions. One nice property of the Nisan-Wigderson pseudorandom number generator is that it is highly parallelizable. For each $j \in [m]$ we can compute the $j$'th bit $NW_{P,\mathcal{S}}(x)[j] = P(x|_{S_j})$ independently as long as we can quickly find the set $S_j \in \mathcal{S}$. Observe that we would need space at least $O(m \ell \log n)$ to store the set family $\mathcal{S} = \{S_1, \ldots, S_m\}$, which could be a problem especially when $m$ is very large. However, if the set family has an explicit construction (e.g., there is a small circuit $C$ s.t. $C(i) = S_i$ for all $i \in [m]$) then we can simply compute $NW_{P,\mathcal{S}}(x)[j] = P(x|_{C(j)})$.


Preliminaries 


Before we formally define a pseudorandom number generator we first define a pseudorandom distribution $X$ over $\{0,1\}^m$. Informally, Definition 19 says that a distribution is pseudorandom if it 'appears' random to any 'small enough' circuit. Given a circuit $C$ we use

$$\mathrm{Adv}_C(X) = \left| \Pr_{x \sim X}[C(x) = 1] - \Pr_{x \in U_m}[C(x) = 1] \right|$$

to denote the advantage of $C$ at predicting whether $x$ was drawn from the distribution $X$ or from $U_m$, where $U_m$ is the uniform distribution over $\{0,1\}^m$. The distribution $X$ 'appears' random to a circuit $C$ if $\mathrm{Adv}_C(X)$ is small.

Definition 19. A distribution $X$ over $\{0,1\}^m$ is said to be $(s, \epsilon)$-pseudorandom if, given any circuit $C$ (taking $m$ inputs) of size at most $s$, $\mathrm{Adv}_C(X) < \epsilon$.

Given a distribution $X$ over $\{0,1\}^n$ and a function $G : \{0,1\}^n \to \{0,1\}^m$ we use $G(X)$ to denote the distribution over $\{0,1\}^m$ induced by $G$. Informally, a function $G : \{0,1\}^n \to \{0,1\}^m$ is pseudorandom if it induces a pseudorandom distribution.

Definition 20. Let $\{G_n\}_{n \in \mathbb{N}}$ be a family of functions such that $G_n : \{0,1\}^n \to \{0,1\}^m$. We say the family is a $(s, \epsilon)$-pseudorandom number generator if $G$ is computable in time $2^{O(n)}$, and $G(U_n)$ considered as a distribution is $(s, \epsilon)$-pseudorandom.

Nisan and Wigderson [115] show how to construct a pseudorandom number generator $G : \{0,1\}^n \to \{0,1\}^m$ using any $(n, \ell, \gamma)$-sharing set family of size $m$. Their construction assumes the existence of a predicate $f : \{0,1\}^\ell \to \{0,1\}$ that is hard for 'small' circuits to predict.

Definition 21. Let $f : \{0,1\}^\ell \to \{0,1\}$ be a boolean function. We say that $f$ is $(s, \epsilon)$-hard if for any circuit $C$ of size $s$, $\left| \Pr_{x \sim \{0,1\}^\ell}[C(x) = f(x)] - \frac{1}{2} \right| < \epsilon$.

Observe that a random function will fool all small circuits with high probability⁵. Following Nisan and Wigderson, we use $H(f)$ to denote the hardness of a function $f$.

Definition 22. Let $f : \{0,1\}^* \to \{0,1\}$ be a boolean function and let $f_\ell$ be the restriction of $f$ to strings of length $\ell$. The hardness of $f$ at $\ell$, $H_f(\ell)$, is defined to be the maximum integer $h_\ell$ such that $f_\ell$ is $(1/h_\ell, h_\ell)$-hard.

Raz et al [126] showed that the Nisan-Wigderson pseudorandom number generator works even if the family of sets $S_1, \ldots, S_m$ only satisfies the weaker condition from Definition 23. Observe that any $(n, \ell, \gamma)$-sharing set family is also a weak $(n, \ell, \gamma)$-sharing set family, but the converse is not necessarily true. We also note that as $m$ increases the requirement $\sum_{j < i} 2^{|S_i \cap S_j|} \le 2^\gamma (m - 1)$ becomes increasingly lax. This allows us to construct arbitrarily large weak $(n, \ell, \gamma)$-sharing families.

Definition 23. A family of sets $S_1, \ldots, S_m \subseteq [n]$ is a weak $(n, \ell, \gamma)$-sharing set family if (1) $\forall i \in [m].\ |S_i| = \ell$, and (2) $\forall i \in [m].\ \sum_{j < i} 2^{|S_i \cap S_j|} \le 2^\gamma (m - 1)$.
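The difference between the two notions can be checked directly on small examples. The sketch below assumes the weak condition reads $\sum_{j<i} 2^{|S_i \cap S_j|} \le 2^\gamma (m-1)$ for every $i$; the example families are hypothetical and chosen only to show one family that is weak but not strong.

```python
from itertools import combinations


def is_sharing(family, ell, gamma):
    """Strong condition: every set has size ell, every pairwise overlap <= gamma."""
    return (all(len(s) == ell for s in family)
            and all(len(a & b) <= gamma for a, b in combinations(family, 2)))


def is_weak_sharing(family, ell, gamma):
    """Weak condition: for each i, sum over j < i of 2^|S_i & S_j| <= 2^gamma * (m - 1)."""
    m = len(family)
    if any(len(s) != ell for s in family):
        return False
    budget = (2 ** gamma) * (m - 1)
    return all(
        sum(2 ** len(family[i] & family[j]) for j in range(i)) <= budget
        for i in range(m)
    )


strong = [{0, 1, 2}, {0, 3, 4}, {1, 3, 5}]
weak_only = [{0, 1, 2}, {0, 1, 3}, {4, 5, 6}, {7, 8, 9}]

print(is_sharing(strong, 3, 1), is_weak_sharing(strong, 3, 1))
print(is_sharing(weak_only, 3, 1), is_weak_sharing(weak_only, 3, 1))
```

The second family has one pair of sets that overlaps in two elements, so it fails the strong condition for $\gamma = 1$, but the overlap is averaged out by the weak condition.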

Informally, we say that a set family $S_1, \ldots, S_m$ is explicitly constructible if there is a fast parallel algorithm $\mathcal{A}(i)$ to compute the $i$'th set $S_i$. We use $\mathrm{Depth}(\mathcal{A})$ to denote the running time of $\mathcal{A}$ when executed in parallel processes and $\mathrm{Work}(\mathcal{A})$ to denote the total number of steps executed in all processes. Similarly, $\mathrm{Space}(\mathcal{A})$ denotes the total space requirement of $\mathcal{A}$. Our notion of explicit constructions (Definition 24) is similar to the notion used by Hartman and Raz [89] except that we also consider the parallel running time of $\mathcal{A}$.

⁵ The argument is straightforward. Fix any circuit $C$. A random function $f : \{0,1\}^\ell \to \{0,1\}$ will satisfy $\mathrm{Adv}_C(f(U_\ell)) < \epsilon$ with very high probability by Chernoff bounds. We can then apply union bounds to argue that a random $f$ will satisfy $\max_{C \in \mathcal{C}} \mathrm{Adv}_C(X) < \epsilon$ for any sufficiently small class $\mathcal{C}$ of circuits.

Definition 24. We say that a set family $S_1, \ldots, S_m \subseteq [n]$, where the size of each set is $\ell$, is $(t_1, t_2, t_3)$-explicitly constructible if there exists an algorithm $\mathcal{A}$ s.t. for all $1 \le i \le m$ we have $\mathcal{A}(i) = S_i$, $\mathrm{Work}(\mathcal{A}) \le t_1$, $\mathrm{Depth}(\mathcal{A}) \le t_2$ and $\mathrm{Space}(\mathcal{A}) \le t_3$.

Definition 25. We use $m(n, \ell, \gamma)$ to denote the maximum value $m$ such that there exists a $(n, \ell, \gamma)$-sharing set family of size $m$.

The $(n, \ell, \gamma)$-sharing set family construction of Chapter 2.6 relies on the Chinese Remainder Theorem. To analyze this construction we will be interested in finding
a  large  set  S  =  {h, . . . ,  tf  of  integers  such  that  S  has  size  £,  the  numbers  in  S  are 
pairwise  coprime,  Yn= i  £  ^  n  and  each  f;  >  f(.  We  will  rely  on  recent  results  on 
prime  density. 

Definition  26.  n(t)  indicates  the  number  of  prime  numbers  less  than  or  equal  to  t.  nn{t) 
indicates  the  maximum  |S|  such  that  S  c  |f  f], ...,  fj  and  Vi  ^  j  e  S.GCD  (hi)  =  i. 

We  are  particularly  interested  in  lower  bounding  the  value  nn(x).  Clearly, 
titi(x)  >  7i (x)  -  7i(x/2).  As  it  turns  out  this  lower  bound  is  nearly  tight  (see  Theorem 
25).  We  can  bound  n(x)  -  ti(x/2)  using  Ramanujan  primes. 

Definition 27. [124] The $t$'th Ramanujan Prime is the smallest integer $R_t$ s.t. $\pi(x) - \pi(x/2) \ge t$ for all $x \ge R_t$.

Allowing $n$ to equal at least $\ell R_\ell$ guarantees that $\left[\frac{R_\ell}{2}, R_\ell\right]$ contains at least $\ell$ primes which will satisfy the conditions of the Chinese Remainder Theorem construction. Sondow's bounds on Ramanujan primes (see Theorem 23) allow us to express this bound on $n$ as an elementary function.


Constructions 


Nisan and Wigderson [115] gave an explicit construction of $(\ell^2, \ell, \gamma)$-sharing set families of size $m = \ell^{\gamma+1}$ for any prime power $\ell$. Given a polynomial $p(x)$ with coefficients in $GF(\ell)$, the finite field of size $\ell$, they define the set $S_p = \{(x, p(x)) \mid x \in GF(\ell)\}$. The family $\mathcal{S} = \{S_p \mid p \text{ has degree} \le \gamma\}$ is $(\ell^2, \ell, \gamma)$-sharing and has size $m = |\mathcal{S}| = \ell^{\gamma+1}$. Given pairwise coprime numbers $n_1 < \ldots < n_\ell$ we provided an explicit construction of $\left(\sum_{i=1}^{\ell} n_i, \ell, \gamma\right)$-sharing families in Chapter 2. Briefly, given an integer $i \ge 0$ we defined the set $S_i = \left\{1 + \sum_{k < j} n_k + (i \bmod n_j) : j \in [\ell]\right\}$, and showed that the family $\mathcal{S} = \left\{S_i \mid 0 \le i < \prod_{j=1}^{\gamma+1} n_j\right\}$ is an $\left(\sum_{i=1}^{\ell} n_i, \ell, \gamma\right)$-sharing set family of size $\prod_{j=1}^{\gamma+1} n_j$.

The  proof  of  Theorem  24  is  based  on  Theorem  22.  We  take  advantage  of 
Sondow's  results  on  prime  density  [142]  to  compare  our  construction  to  the  con¬ 
struction  of  Nisan  and  Wigderson. 


Theorem 23. [142] For all $t \ge 1$ the following bound holds: $2t \ln 2t < R_t < 4t \ln 4t$.
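The first few Ramanujan primes, and Sondow's bounds on them, can be checked with a short script. The sketch relies on the upper bound $R_t < 4t \ln 4t$ to decide how far to sieve, so it illustrates the theorem rather than proving it; the function names are hypothetical.

```python
import math


def prime_counts(limit):
    """Return a list pi where pi[x] is the number of primes <= x."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    counts, running = [], 0
    for flag in sieve:
        running += flag
        counts.append(running)
    return counts


def ramanujan_primes(how_many):
    """R_t = 1 + the last x with pi(x) - pi(x/2) < t, for t = 1..how_many."""
    limit = math.ceil(4 * how_many * math.log(4 * how_many))
    pi = prime_counts(limit)
    last_failure = [0] * (how_many + 1)
    for x in range(1, limit + 1):
        gap = pi[x] - pi[x // 2]
        for t in range(gap + 1, how_many + 1):
            last_failure[t] = x
    return [last_failure[t] + 1 for t in range(1, how_many + 1)]


primes = ramanujan_primes(5)
for t, r in enumerate(primes, start=1):
    print(t, r, 2 * t * math.log(2 * t) < r < 4 * t * math.log(4 * t))
```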

Theorem 24. $\forall n \ge 4\ell^2 \ln 4\ell$, $m(n, \ell, \gamma) \ge (2\ell \ln 2\ell)^{\gamma+1}$. Furthermore, this set family is $(t_1, t_2, t_3)$-explicitly constructible with $t_1 = O(\ell (\log m)(\log \ell))$, $t_2 = O(\log \log m)$ and $t_3 = O(\ell (\log m)(\log \ell))$.


Proof. Theorem 23 due to Sondow [142] shows that there will always be at least $\ell$ primes $p_1, \ldots, p_\ell$ between $2\ell \ln 2\ell$ and $4\ell \ln 4\ell$. We have $\sum_{i=1}^{\ell} p_i \le \ell (4\ell \ln 4\ell) \le n$. Note that $\prod_{i=1}^{\gamma+1} p_i \ge (2\ell \ln 2\ell)^{\gamma+1}$. It follows from Theorem 22 that $m(n, \ell, \gamma) \ge (2\ell \ln 2\ell)^{\gamma+1}$.

To construct this set family our algorithm will store the following values: (1) the primes $p_1, \ldots, p_\ell$, (2) the precomputed values $2^k \bmod p_j$ for each $0 \le k \le \min\{p_j - 1, \lfloor \log m \rfloor\}$ and $1 \le j \le \ell$, and (3) the precomputed values $k p_j$ for each $1 \le k \le \log \log \log m$ and each $1 \le j \le \ell$. We need $O(\ell \log \ell)$ bits of space to store each of the primes, $O(\ell (\log \ell)(\log m))$ bits of space to store each of the values $2^k \bmod p_j$ and $O(\ell (\log \ell)(\log \log \log m))$ bits of space to store each of the values $k p_j$. Given an index $i_0 = i < m$ our algorithm $\mathcal{A}$ computes each of the elements of $S_i$ in parallel: to compute $S_i[j]$ it suffices to compute the value $i_0 \bmod p_j$. We will focus on the computation of $i_0 \bmod p_j$. For each $k \le \lfloor \log m \rfloor$ we look up the precomputed values $2^k \bmod p_j$ and compute $y_k = i_0[k] \, 2^k \bmod p_j$ (here, $i_0[k]$ denotes the $k$'th bit of $i_0$ when $i_0$ is viewed as a binary string). We then compute the value $i_1 = \sum_k y_k$ (observe that $i_1 \equiv i_0 \bmod p_j$). This can be done by a depth $O(\log \log m)$ circuit by using two tricks: divide-and-conquer and carry-save addition. We group $y_1, \ldots, y_{\lfloor \log m \rfloor}$ into triples (e.g., $(y_1, y_2, y_3), \ldots$); for each triple we compute the partial sum $ps$ (the bits of $ps$ are defined as $ps[t] = y_1[t] \oplus y_2[t] \oplus y_3[t]$ for each index $t$) and the shift-carry $sc$ ($sc[t+1] = (y_1[t] \wedge y_2[t]) \vee (y_1[t] \wedge y_3[t]) \vee (y_2[t] \wedge y_3[t])$). Observe that $y_1 + y_2 + y_3 = ps + sc$. We went from $\log m$ values that we needed to add to $\frac{2}{3} \log m$ values that we needed to add, so after $O(\log \log m)$ rounds we will have the value $q \equiv i_0 \bmod p_j$. While $q$ may not be the final answer (e.g., we could have $q > p_j$) we have $q \le p_j \log m \ll i_0 = O(m)$ so we are making progress. Now we can recursively compute a value $i_2$ s.t. $i_2 \equiv q \bmod p_j$ and $i_2 \le p_j \log(1 + \log \log m)$. This time we are only adding $O(\log \log m)$ numbers so we need $O(\log \log \log m)$ rounds of computation to find $i_2$. Now, we can search for the biggest value $k p_j$ s.t. $i_2 \ge k p_j$ (note that there are only $\log \log \log m$ values to check). Our final answer is simply $i_2 - k p_j = i_0 \bmod p_j$. □
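The carry-save step at the heart of the proof can be sketched sequentially. The code below is an illustration under simplifying assumptions: it runs the rounds one after another rather than in parallel, and it replaces the final recursive reduction and table lookup with a single `%` on an already small number. The helper names are hypothetical.

```python
def carry_save(a, b, c):
    """Replace three addends by two with the same sum (ps and sc in the proof)."""
    partial_sum = a ^ b ^ c
    shift_carry = ((a & b) | (a & c) | (b & c)) << 1
    return partial_sum, shift_carry


def reduce_to_two(values):
    """Apply carry-save rounds until at most two addends remain."""
    values = list(values)
    while len(values) > 2:
        next_round = []
        for k in range(0, len(values) - 2, 3):
            next_round.extend(carry_save(*values[k:k + 3]))
        # Addends that did not fit into a triple carry over unchanged.
        next_round.extend(values[len(values) - len(values) % 3:])
        values = next_round
    return values


def mod_via_bits(i, p):
    """Compute i mod p from the table 2^k mod p and the bits of i."""
    table = [pow(2, k, p) for k in range(i.bit_length())]
    terms = [table[k] for k in range(i.bit_length()) if (i >> k) & 1]
    small = sum(reduce_to_two(terms))  # congruent to i mod p, but much smaller
    return small % p                   # stand-in for the final lookup step


print(carry_save(5, 6, 7), mod_via_bits(1_000_003, 97))
```

Each round turns every triple of addends into two, so the number of addends shrinks by a constant factor per round, which is where the logarithmic depth in the number of addends comes from.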

Note that our construction only requires relatively prime numbers. So the results from Theorem 24 could be improved by including non-prime values. However, Theorem 25 implies that these improvements will not be particularly significant.

Theorem  25.  Vn  G  Z+.  nn(n)  <  n(n)  -  n('j)  +  7i(  \fn). 

Proof. Let $S \subseteq \left\{\lceil n/2 \rceil, \ldots, n\right\}$ be a set of coprime numbers of maximum size. Observe that each prime number $p \in [n]$ is a factor of at most one number in $S$. Without loss of generality we can assume that each of the primes between $\frac{n}{2}$ and $n$ is contained in $S$ (if $p \notin S$ then, because $S$ is of maximum size, we must have some $t = pq \in S$, but in this case we can simply replace $t$ with $p$). The number of primes between $\frac{n}{2}$ and $n$ is $\pi(n) - \pi\left(\frac{n}{2}\right)$, and all of these integers are relatively prime to each other and to every other number in the range $[n]$. All other numbers in $S$ must have at least two prime factors, and at least one of them must be less than or equal to $\sqrt{n}$. Since each prime factor less than or equal to $\sqrt{n}$ can be used at most once, for the members of $S$ to remain pairwise relatively prime, at most $\pi(\sqrt{n})$ non-primes can be included in the set, each containing a single prime factor less than $\sqrt{n}$. □

Comparing Constructions. To compare our construction from Chapter 2 with the construction of Nisan and Wigderson [115] we set $n = 4\ell'^2 \ln 4\ell'$ and we set $\ell = \sqrt{4\ell'^2 \ln 4\ell'}$. The construction of Nisan and Wigderson gives us $m(n, \ell, \gamma) \ge \ell^{\gamma+1} = \left(2\ell' \sqrt{\ln 4\ell'}\right)^{\gamma+1}$, while our construction gives us $m(n, \ell', \gamma) \ge (2\ell' \ln 2\ell')^{\gamma+1} > \left(2\ell' \sqrt{\ln 4\ell'}\right)^{\gamma+1}$. However, $\ell' < \ell$ so our construction has a smaller $\ell$.


Constructing Weak $(n, \ell, \gamma)$-sharing set families. We now show that our explicit Chinese Remainder Theorem set family construction can also be used to construct weak $(n, \ell, \gamma)$-sharing set families of arbitrary size $m$. Our main results are stated in Theorem 26.

Theorem  26.  For  all  m  there  is  an  explicitly  constructive  weak  (4f2  In  At,  £,  y)-sharing 
set  family  of  size  m  as  long  as  1)'  >  (l  +  — [TiTv)-  Furthermore,  this  set  family  is 
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(fi,  t2,  t3)-explicitly  constructible  with  t\  -  0(£  (log  m )  (log  £)),t2  =  O  (log  log  m )  and 
h  =  O  (/  (log m)  (log/’)). 


Proof.  Let  m  be  given.  We  use  the  explicit  construction  from  Chapter  2.6.  By 
Theorem  23  we  can  find  £  primes  such  that  2/ In  2/  <  p3  <  . . .  <  pc  <  A£  In  A£.  In 
particular,  we  let  S,  =  {l  +  pk  +  (i  mod  p^j  j  e  [/]}.  Now  for  i  e  [ m ]  we  have 


OO  CO 

E  2|s,ns'1  =  Yj2t\\i\i<iA  Is'  n  s'l  =  ,cl| £  E  2‘ \\i  \i<iA  ls< n  s-l a  k\ 


<  >  2‘ 


n  '-1  <  y  2‘ 

k)n  Uvrh 


£\  i  - 1 


k)  (2£\n2£)k 


Zi  -  i  , .  ..l  In 2£ 

(]n2£)k~  1  (-1  +  In  2/ 


<  (m  -  1)2}/ , 


where the second inequality follows from the Chinese Remainder Theorem. We already showed that this set family is explicitly constructible in the proof of Theorem 24. □


Raz  et  al  gave  a  randomized  construction  of  weak  (  k  ■  £,  £,  y^-sharing  set 
families for any $m, \gamma > 0$. While they showed that their construction could be derandomized, their construction is not explicit (e.g., the construction of the $i$'th subset $S_i$ is dependent on the sets $S_1, \ldots, S_{i-1}$). Theorem 26 shows that our construction is competitive with the construction of Raz et al [126], though the value of $n$ is slightly larger. Hartman and Raz [89] later showed how to use the Nisan-Wigderson construction to get an explicit construction of weak $(\ell^2, \ell, O(1))$-sharing set families with $t_1 = t_2 = O(\mathrm{poly}(\ell, \log m))$ and $t_3 = O(\log m)$. Their construction requires less space than ours, $O(\log m)$ vs $O(\ell (\log m)(\log \ell))$ space, but our construction will run faster on a parallel computer, $O(\log \log m)$ vs $O(\mathrm{poly}(\ell, \log m))$ time. The construction of Hartman and Raz [89] could be optimized to run in parallel time $t_2 = O(\log \log m)$ by precomputing a few strategic values (e.g., an $\ell \times \ell$ multiplication table and an $\ell \times \log m$ exponentiation table), but then we would require $O(\ell^2 \log \ell)$ space to store all of the precomputed values.


Parallel  Pseudorandom  Number  Generators.  Nisan  and  Wigderson  proved  that 
if  y  =  log  m,  S  is  a  (n,  £,  y)-sharing  set  family  and  Hf(£)  >  2 m1  that  their  construc¬ 
tion  NW^  is  a  (in2,  f)  pseudorandom  number  generator.  In  particular.  Theorem 
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27  implies  that  if  D  is  a  circuit  of  size  |D|  <  m2  that  distinguishes  NW^s  (H„)  from 
U,„  with  advantage  ADV;)  (NWy  ( L/„))  >  f  then  there  exists  a  circuit  C  of  size 
|C|  <  2 m2  which  predicts  f(x)  with  advantage  ADVC  (f  (Uf))  >  This  contra¬ 
dicts  the  definition  of  Hf  (£).  Raz  et  al  [126]  observed  that  it  suffices  for  S  to  be 
a  weak  (n,  £,  y  {-sharing  set  family.  If  we  let  Sm  denote  the  explicitly  constructible 
weak  (A£2  In  A£,  £,  y )-sharing  set  family  of  size  m  then  for  any  m  >  0  N Wf,sm  is 
a  (m2,  Tj  pseudorandom  number  generator  with  seed  length  AC1  In  At  assuming 
that  Hf  {£)  >  2m2.  Because  Sm  is  explicitly  constructible  we  can  compute  each  bit 
NW/a  (x)  [z]  =  /  (X|S;)  independently 

Theorem  27.  [115,  126]  Let  f  :  {0,  l}f  — >  {0,1}  be  a  boolean  function  and  S  = 
{Si,...,Sm}  be  an  weak  (n,  £,  y)-sharing  set  family.  Suppose  D  :  [0,  l}m  — >  {0,1}  is 
such  that  ADV D  (NW f ^  (Unf)  >  e,  then  there  exists  a  circuit  C  of  size  |C|  <  \D\  + 
o(max;e[m]  YJ1<j2\s‘r]S,\nfj  such  that  |Pr^{0#i}«  [C(x)  =  f(x)]  -\\>^ 


7.7.3  Upper  Bounds 


Our main result in this section is Theorem 28. We prove that $m(n, \ell, \gamma) = c_1$ whenever $\ell = \frac{n}{c_1}$ and $\gamma = c_2 n$ provided that $c_2$ is sufficiently small. We previously showed that $m(n, \ell, \gamma) \le \binom{n}{\gamma+1} / \binom{\ell}{\gamma+1}$. We note that this bound is far from tight whenever $\ell$ is large. For example, if $c_1 = 2$ and $c_2 = \frac{1}{10}$ then this upper bound $\binom{n}{n/10+1} / \binom{n/2}{n/10+1}$ grows exponentially with $n$. By contrast, Theorem 28 implies that $m(n, n/2, n/10) = 2$.

Theorem 28. $\forall\, 0 < c_2 < 1$, $n, c_1 \in \mathbb{N}$ such that $c_1 \mid n$: $m\left(n, \frac{n}{c_1}, c_2 n\right) = c_1$ iff $c_2 < \frac{2}{c_1^3 + c_1^2}$.
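The two bounds can be compared numerically. The sketch below assumes the Theorem 3 bound has the form $\lfloor \binom{n}{\gamma+1} / \binom{\ell}{\gamma+1} \rfloor$ and that the threshold in Theorem 28 is $2/(c_1^3 + c_1^2)$; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def theorem3_upper_bound(n, ell, gamma):
    """floor( C(n, gamma+1) / C(ell, gamma+1) )."""
    return comb(n, gamma + 1) // comb(ell, gamma + 1)


def theorem28_threshold(c1):
    """Value of c2 below which m(n, n/c1, c2*n) equals c1."""
    return Fraction(2, c1 ** 3 + c1 ** 2)


# Rows of Table 7.7 that this formula reproduces.
for n, ell, gamma in [(16, 4, 1), (20, 6, 2), (18, 6, 3), (43, 4, 1)]:
    print((n, ell, gamma), theorem3_upper_bound(n, ell, gamma))

# The c1 = 2, c2 = 1/10 example: Theorem 3 is loose, Theorem 28 gives m = 2.
print(theorem3_upper_bound(100, 50, 10), Fraction(1, 10) < theorem28_threshold(2))
```

Even at $n = 100$ the Theorem 3 bound for $(n, n/2, n/10)$ is already in the thousands, while the threshold condition pins the true maximum at 2.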


Before we prove Theorem 28 we first prove an easier result. Theorem 29 upper bounds $\lim_{n \to \infty} m(n, \ell, \gamma)$ when $\ell$ is in a constant ratio to $n$ and $\gamma$ is small. Theorem 29 holds because the $k$'th set $S_k$ must use $cn - (k - 1)\gamma$ new elements (elements that are not in $\bigcup_{i=1}^{k-1} S_i$).

Theorem  29.  V  yc/  0  <  c  <  1  such  that  cn  e  N.  m(n,cn,yc)  — >  as  n  — >  00. 


Proof. Let $\ell = cn$ and let $t \in \mathbb{N}$ be an integer such that $t > \left\lfloor \frac{1}{c} \right\rfloor$. The first set will contain $\ell$ elements. The second set can share at most $\gamma$ of them, so the second set must contain at least $\ell - \gamma$ previously unused elements. Therefore the union of the first two sets must contain at least $2\ell - \gamma$ elements. In a similar manner, the $k$'th set must contain at least $\ell - (k - 1)\gamma$ new elements, therefore

$$k\ell - \frac{(k-1)k\gamma}{2} \le \left| \bigcup_{i=1}^{k} S_i \right| \le n. \qquad (7.1)$$

Assume  for  contradiction  that  lim  supj;^TC  m(n,  cn,  yc)  =  z.Then  we  have 


lim  n  —  t£  + 

n—>oo 


(Jc-l)Jc/ 


=  lim  n  -  t cn  + 

n^oo  \ 

=  lim  (n  (1  -  ct)) 

tt— >oo 

—  —  oo  . 


(k-l)ky^ 


This  contradicts  equation  7.1. 


□ 
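The counting argument behind equation 7.1 can be tried on concrete numbers. The sketch below assumes the bound that the $k$'th set contributes at least $\ell - (k-1)\gamma$ new elements, clamped at zero; the parameter values are hypothetical illustrations of the $c_1 = 2$ case.

```python
def min_union_size(k, ell, gamma):
    """Lower bound on |S_1 u ... u S_k|: set i adds at least ell - (i-1)*gamma elements."""
    return sum(max(0, ell - i * gamma) for i in range(k))


def ruled_out(k, n, ell, gamma):
    """True if the counting bound alone shows no family of size k fits in [n]."""
    return min_union_size(k, ell, gamma) > n


n, ell = 60, 30                       # c1 = 2, so ell = n / 2
print(ruled_out(3, n, ell, gamma=6))   # c2 = 1/10: three sets need 72 > 60 elements
print(ruled_out(3, n, ell, gamma=10))  # c2 = 1/6: the bound gives exactly 60
```

When every term is nonnegative the sum equals $k\ell - \frac{(k-1)k\gamma}{2}$, the left-hand side of equation 7.1. With $c_2 = 1/10$ a third set cannot fit, while at $c_2 = 1/6$ the counting bound no longer excludes it.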


The proof of Theorem 28 is a bit longer.

Proof of Theorem 28. Suppose that for some valid $n, c_1, c_2$ there is an $(n,\ell,\gamma)$-sharing set family of size $c_1 + 1$. By equation 7.1, the number of elements used by such a set family must satisfy

\[ (c_1 + 1)\ell - \frac{c_1(c_1 + 1)\gamma}{2} \le n. \tag{7.2} \]

Taking advantage of the fact that $\ell = \frac{n}{c_1}$ and $\gamma = c_2 n$, the inequality can be simplified:

\begin{align*}
n + \ell - \frac{c_1(c_1+1)\gamma}{2} &\le n \\
\ell &\le \frac{c_1(c_1+1)\gamma}{2} \\
\frac{n}{c_1} &\le \frac{c_1(c_1+1)c_2 n}{2} \\
2n &\le (c_1^3 + c_1^2)\, c_2 n \\
\frac{2}{c_1^3 + c_1^2} &\le c_2.
\end{align*}

Thus, all set families of size $c_1 + 1$ or greater must have $c_2 \ge \frac{2}{c_1^3 + c_1^2}$, and $c_2 < \frac{2}{c_1^3 + c_1^2}$ guarantees the set family will have a size of at most $c_1$.

Since $c_1 \ell = n$, it is possible to make a family of size $c_1$ for any value of $c_2$ by simply choosing sets that share no elements. Therefore, the size of the largest possible set family for any $n, \ell, \gamma$ meeting the specified conditions is $c_1$ if $c_2 < \frac{2}{c_1^3 + c_1^2}$.

If $c_2 \ge \frac{2}{c_1^3 + c_1^2}$, there will always exist a set family of size $\ge c_1 + 1$. To create such a family, choose $c_1 + 1$ sets such that each of them shares $\gamma$ elements with each of the others. This will be possible as long as:

\begin{align*}
c_1 \gamma &\le \ell \\
n c_1 c_2 &\le \frac{n}{c_1} \\
c_1^2 c_2 &\le 1 \\
\frac{2 c_1^2}{c_1^3 + c_1^2} &\le 1.
\end{align*}

Since this final inequality is true for all possible values of $c_1$, such a set family can always be created and, as shown earlier, it will use exactly $n$ elements when $c_2 = \frac{2}{c_1^3 + c_1^2}$. Since increasing $c_2$ will not eliminate any possible set families, no $n, \ell, \gamma$ satisfying the conditions with $c_2 \ge \frac{2}{c_1^3 + c_1^2}$ will have a maximum family size $< c_1 + 1$. Therefore, the size of the largest possible set family for a valid $n, \ell, \gamma$ will be $c_1$ iff $c_2 < \frac{2}{c_1^3 + c_1^2}$. □
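The counting argument behind Theorem 28 can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, and it only evaluates the lower bound of equation 7.1 with $k = c_1 + 1$ to see when a family of size $c_1 + 1$ is ruled out.

```python
from fractions import Fraction


def min_union_size(k: int, ell: int, gamma: int) -> Fraction:
    """Lower bound from equation 7.1 on the number of elements used by
    k sets of size ell whose pairwise intersections have size <= gamma."""
    return Fraction(k * ell) - Fraction(k * (k - 1) * gamma, 2)


def threshold(c1: int) -> Fraction:
    """The value 2 / (c1^3 + c1^2) appearing in Theorem 28."""
    return Fraction(2, c1**3 + c1**2)


def size_ruled_out(n: int, c1: int, gamma: int) -> bool:
    """True when equation 7.2 fails, i.e. a family of size c1 + 1
    would need more than n elements (assumes c1 divides n)."""
    ell = n // c1
    return min_union_size(c1 + 1, ell, gamma) > n


if __name__ == "__main__":
    n, c1 = 60, 2
    for gamma in (6, 10):
        c2 = Fraction(gamma, n)
        print(gamma, c2 < threshold(c1), size_ruled_out(n, c1, gamma))
```

For $n = 60$, $c_1 = 2$ and $\gamma = 6$ (so $c_2 = 1/10$), three sets would need at least 72 elements, which exceeds $n$; this matches the example $m(n, n/2, n/10) = 2$.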

We now show that the upper bound from Theorem 28 is nearly tight. In particular, when $\gamma = c_2 n$ for a slightly larger constant $c_2$ then $m(n,\ell,\gamma)$ is exponentially large. Theorem 30 lower bounds the values of $c_2$ for which $m(n,\ell,\gamma)$ is exponentially large. In particular, we demonstrate the existence of an $(n,\ell,\gamma)$-sharing set family of exponential size by showing that the probability of obtaining such a set family through random selection is non-zero. Our proof uses the following randomized construction of an $(n,\ell,\gamma)$-sharing set family. Independently choose random integers $r_i^j$, each in the range $0 \le r_i^j < c_1$, for $i \in \{0,\ldots,\ell-1\}$ and $j \in [m]$. Let $S_j = \bigcup_{i=0}^{\ell-1} \{ i c_1 + r_i^j \}$. We use standard concentration bounds due to Chernoff [53] to show that $|S_j \cap S_k| \le \gamma$ with high probability, and then we use union bounds to argue that the entire set family is $(n,\ell,\gamma)$-sharing with non-zero probability.

Theorem 30. $\forall\, c_2 > 0$, $n, c_1 \in \mathbb{N}$ such that $c_1 \mid n$: $m(n, n/c_1, c_2 n) \ge \exp(\Omega(n))$ if $c_2 \ge \frac{1}{c_1^2} + \epsilon$.

The proof of Theorem 30 is based on standard concentration bounds due to Chernoff. We use the specific form from Theorem 31.

Theorem 31. [53] Let $X_1, \ldots, X_n \in [0,1]$ be a sequence of independent random variables. Let $S = \sum_{i=1}^{n} X_i$, and let $\mu = E[S]$. Then for all $\delta > 0$

\[ \Pr[S \ge \mu + \delta n] \le e^{-2n\delta^2}. \]

Proof of Theorem 30. We create an $(n,\ell,\gamma)$-sharing set family by creating sets in the following manner: independently choose random integers $r_i^j$, each in the range $0 \le r_i^j < c_1$, for $j \in [m]$ and $i \in \{0,\ldots,\ell-1\}$. Let $S_j = \bigcup_{i=0}^{\ell-1} \{ i c_1 + r_i^j \}$. Given two such sets $S_j, S_k$ let

\[ X_i = \begin{cases} 1 & : r_i^j = r_i^k \\ 0 & : r_i^j \ne r_i^k. \end{cases} \]

Then the number of elements shared by $S_j$ and $S_k$ is

\[ |S_j \cap S_k| = \sum_{i=0}^{\ell-1} X_i. \]

Let $\mu = E\left[|S_j \cap S_k|\right] = \frac{n}{c_1^2}$ denote the expected number of shared elements. The probability that two such sets share more than $\gamma$ elements, given $c_2 = \frac{1}{c_1^2} + \epsilon$, is

\begin{align*}
\Pr\left[|S_j \cap S_k| > \gamma\right] &= \Pr\Big[\sum_{i=0}^{\ell-1} X_i > c_2 n\Big] \\
&= \Pr\Big[\sum_{i=0}^{\ell-1} X_i > \frac{n}{c_1^2} + n\epsilon\Big] \\
&\le \Pr\Big[\sum_{i=0}^{\ell-1} X_i \ge \mu + \epsilon n\Big] \\
&\le e^{-2n\epsilon^2},
\end{align*}

with the last step by Theorem 31. Thus the probability that two randomly selected sets share more than $\gamma$ elements is at most $e^{-2n\epsilon^2}$.

An $(n,\ell,\gamma)$-sharing set family of size $m$ will contain $\binom{m}{2}$ pairs of sets. The probability that the family is not valid, with some pair of sets sharing more than $\gamma$ elements, is

\[ \Pr\left[\exists j \ne k : |S_j \cap S_k| > \gamma\right] \le \binom{m}{2} e^{-2n\epsilon^2} < m^2 e^{-2n\epsilon^2} \]

by the union bound. For $m \le e^{n\epsilon^2}$, this probability will be less than 1, meaning there is a non-zero chance of forming a valid set family of size $m$ by random selection and therefore such a family must exist. □
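The randomized construction used in the proof of Theorem 30 can be sketched in a few lines. This is an illustrative sketch only; the helper names are hypothetical, and the check simply measures the largest pairwise intersection of a sampled family rather than reproducing the probabilistic bound.

```python
import random
from itertools import combinations


def random_block_family(n: int, c1: int, m: int, rng: random.Random):
    """Sample m sets; set j takes one element i*c1 + r from each block
    {i*c1, ..., i*c1 + c1 - 1}, with r chosen uniformly from {0, ..., c1-1}.
    Assumes c1 divides n, so each set has ell = n / c1 elements."""
    ell = n // c1
    return [
        frozenset(i * c1 + rng.randrange(c1) for i in range(ell))
        for _ in range(m)
    ]


def max_pairwise_overlap(family) -> int:
    """Largest intersection size over all pairs of sets in the family."""
    return max((len(a & b) for a, b in combinations(family, 2)), default=0)


def is_sharing(family, ell: int, gamma: int) -> bool:
    """Check that every set has size ell and every pair shares <= gamma elements."""
    return all(len(s) == ell for s in family) and max_pairwise_overlap(family) <= gamma


if __name__ == "__main__":
    rng = random.Random(0)
    family = random_block_family(n=600, c1=3, m=50, rng=rng)
    # Expected overlap of a pair is n / c1^2; print the observed maximum.
    print(600 / 3**2, max_pairwise_overlap(family))
```

Because each set picks exactly one element per block, two sets can only collide inside the same block, which is what makes the overlap a sum of $\ell$ independent indicator variables.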

We previously observed that $m(n,\gamma+1,\gamma) = \binom{n}{\gamma+1}$ whenever $n \ge \gamma + 1$. In general $m(n,\ell,\gamma) \ge m(n,\ell+1,\gamma)$ whenever $\ell \ge \gamma + 1$.

Claim 9. For all $n \ge \gamma$ we have $m(n,\ell,\gamma) \ge m(n,\ell+1,\gamma)$ whenever $\ell \ge \gamma + 1$.

Proof. Suppose that $\ell \ge \gamma + 1$ and we have an $(n,\ell+1,\gamma)$-sharing set family $S_1, \ldots, S_m \subseteq [n]$ of size $m$. We can form an $(n,\ell,\gamma)$-sharing set family $S'_1, \ldots, S'_m \subseteq [n]$ by picking some element $s_i \in S_i$ and setting $S'_i = S_i - \{s_i\}$ for each $i \in [m]$. Observe that this argument does not apply whenever $\ell = \gamma$ because then we might have $S'_i = S'_j$ for $i \ne j$. □
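The element-removal step in the proof of Claim 9 can be illustrated on a small family. This is a hedged sketch: `shrink_family` is a hypothetical helper, and removing the smallest element is just one arbitrary choice of $s_i$.

```python
def shrink_family(family):
    """Remove one element (here: the smallest) from every set in the family.
    Intersections can only shrink, since each new set is a subset of the old one."""
    return [s - {min(s)} for s in family]


if __name__ == "__main__":
    # A (6, 3, 1)-sharing family: sets of size 3, pairwise overlap at most 1.
    family = [frozenset({1, 2, 3}), frozenset({3, 4, 5}), frozenset({1, 4, 6})]
    print(shrink_family(family))

    # When ell = gamma the sets may collapse onto each other:
    # both sets below shrink to {3}.
    print(shrink_family([frozenset({1, 3}), frozenset({2, 3})]))
```

The second call shows why the condition $\ell \ge \gamma + 1$ matters: two sets of size $\gamma + 1$ sharing $\gamma$ elements can become identical after one element is removed from each.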


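Claim 9 implies that whenever $n/2 \ge \gamma + 1$ we have

\[ \max_{\ell \ge \gamma} m(n,\ell,\gamma) = m(n,\gamma+1,\gamma) \]

and whenever $\gamma \ge n/2$ we have $\max_{\ell \ge \gamma} m(n,\ell,\gamma) = m(n,\gamma,\gamma) = \binom{n}{\gamma}$. Clearly, the inequality $m(n,\ell,\gamma) \le m(n,\ell,\gamma+1)$ also holds, since any $(n,\ell,\gamma)$-sharing set family is also $(n,\ell,\gamma+1)$-sharing. Both of these inequalities also hold for weak $(n,\ell,\gamma)$-sharing set families.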


7.7.4  Open  Questions 

We  conclude  with  some  open  questions. 

We have shown that our explicit construction of $(n,\ell,\gamma)$-sharing set families can be used with the weaker requirements of Raz et al. [126] to create weak $(n,\ell,\gamma)$-sharing set families of arbitrarily large size. Our analysis uses a number of potentially loose bounds, however, so it is possible that a better analysis of our construction for weak set families could improve our requirements on the parameters. Also of interest is whether there is another explicit construction that would perform better than the Blocki et al. construction. The explicit construction of Hartman and Raz [89] runs in sequential time $\mathrm{poly}(\log m, \ell)$ and space $O(\log m)$. Our construction runs in parallel time $O(\log \log m)$, but requires more space than the construction of Hartman and Raz. Future work could explore the space-depth trade-off in explicit constructions of weak $(n,\ell,\gamma)$-sharing set families.


We have shown that the value $m(n, n/c_1, nc_2)$ is constant whenever $c_2 < \frac{2}{c_1^3 + c_1^2}$. Furthermore, we showed that whenever $c_2 > \frac{1}{c_1^2}$, $m(n, n/c_1, nc_2)$ grows exponentially. How does $m(n, n/c_1, nc_2)$ grow whenever $c_2 \in \left[ \frac{2}{c_1^3 + c_1^2}, \frac{1}{c_1^2} \right]$?


We  have  shown  that  nn(n)  never  exceeds  n(n)  -  7i(|)  +  n(  y/n).  We  hypothesize 
that  nn(n)  =  n(n )  -  7z(|)  +  tc(  yfn)  for  all  n  >  55.  A  simple  method  to  select  a 
maximally-sized  set  of  relatively  prime  integers  is  to  take  the  square  of  each  prime 
between  and  \fn,  and  the  product  of  the  y'th  prime  less  than  and  the  /c'th 
prime  greater  than  \fn,  for  j  from  1  to  n(  \fn)  and  k  —  j  unless  this  would  make 
the  product  less  than  |  in  which  case  k  is  chosen  to  be  the  minimum  value  greater 
than  the  previous  k  so  that  the  product  is  great  than  |.  With  the  aid  of  a  computer 
we  have  shown  this  equation  true  for  all  n  from  1  to  100,000,  except  for  51,  52,  53, 
and  54. 
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Chapter  8 


Appendix:  Human  Computable 
Passwords 
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8.1  Human  Computable  Passwords  Challenge 


                 Scheme 1            Scheme 2
n                m       Winner      m       Winner
100 digits       1000    N/A         500     N/A
100 digits       500     N/A         300     N/A
100 digits       300     N/A         200     N/A
50 digits        500     N/A         300     N/A
50 digits        300     N/A         150     N/A
50 digits        150     N/A         100     N/A
30 digits        300     N/A         150     N/A
30 digits        100     N/A         100     N/A
30 digits        50      N/A         50      N/A

Table 8.1: Human Computable Password Challenges
n: Secret Length
m: # Challenge-Response Pairs

While we provided asymptotic security bounds for our human computable password schemes, in our context it is particularly important to understand the constant factors. In our context, we can assume that $n \le 100$, so it would be feasible for the adversary to execute an attack that takes time proportional to $10^{\sqrt{n}} \le 10^{10}$. We conjecture that in practice scheme 2 is slightly weaker than
scheme  1  when  n  <  100  despite  the  fact  that  s  (/i)  <  s  (J2)  because  of  the  at¬ 
tack  described  in  remark  2.  This  attack  requires  O  ( nl+g examples,  and  the 

running  time  O  (d^poly(nfj  may  be  feasible  for  n  <  100.  To  better  understand 
the  exact  security  bounds  we  created  several  public  challenges  for  researchers  to 
break  our  human  computable  password  schemes  under  different  parameters  (see 
Table 8.1). The challenges can be found at http://www.cs.cmu.edu/~jblocki/HumanComputablePasswordsChallenge/challenge.htm. These challenges were presented during the rump session at ASIACRYPT 2013 [29]. For each challenge we selected a random secret mapping $\sigma \in \mathbb{Z}_{10}^n$, and published (1) $m$ single digit challenge-response pairs $(C_1, f(\sigma(C_1))), \ldots, (C_m, f(\sigma(C_m)))$, where each clause $C_i$ is chosen uniformly at random from $X_k$, and (2) 20 length-10 password challenges $\vec{C}_1, \ldots, \vec{C}_{20} \in (X_k)^{10}$. The goal is to guess one of the secret passwords $p_i = f(\sigma(\vec{C}_i))$ for some $i \in [20]$.
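The structure of a published challenge can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used to generate the actual challenges: a clause is assumed to be an ordered tuple of $k$ distinct variable indices, the function `f` is passed in as a parameter, and `toy_f` is a hypothetical stand-in that is not one of the schemes from this thesis.

```python
import random


def sample_clause(n: int, k: int, rng: random.Random):
    """Assumed model of a clause: k distinct variable indices in random order."""
    return tuple(rng.sample(range(n), k))


def make_challenge(n, k, m, f, rng, num_passwords=20, password_len=10):
    """Pick a random secret mapping sigma in Z_10^n and derive
    (1) m single-digit challenge-response pairs and
    (2) num_passwords password challenges of length password_len."""
    sigma = [rng.randrange(10) for _ in range(n)]

    def respond(clause):
        return f(tuple(sigma[v] for v in clause))

    pairs = []
    for _ in range(m):
        clause = sample_clause(n, k, rng)
        pairs.append((clause, respond(clause)))

    password_challenges = [
        [sample_clause(n, k, rng) for _ in range(password_len)]
        for _ in range(num_passwords)
    ]
    passwords = [tuple(respond(c) for c in chal) for chal in password_challenges]
    return sigma, pairs, password_challenges, passwords


def toy_f(xs):
    """Hypothetical placeholder function (sum of digits mod 10)."""
    return sum(xs) % 10


if __name__ == "__main__":
    rng = random.Random(1)
    sigma, pairs, challenges, passwords = make_challenge(30, 14, 50, toy_f, rng)
    print(pairs[0], passwords[0])
```

An attacker in this setting sees `pairs` and `challenges` but not `sigma` or `passwords`, and wins by guessing any one entry of `passwords`.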


8.2  Statistical  Dimension 


Our statistical dimension lower bounds closely mirror the lower bounds from [73] for binary predicates. In particular Lemmas 5, 6, 3, 7 and 8 are similar to Lemmas 2, 4, 5, 6 and 7 from [73] respectively. The high level proof strategy is also very similar. Because we are working with planted solutions $\sigma \in \mathbb{Z}_d^n$ instead of $\sigma \in \{\pm 1\}^n$, we need to use different Fourier basis functions. We use the basis functions $\chi_\alpha$, where for $\alpha \in \mathbb{Z}_d^k$

\[ \chi_\alpha(x) = \exp\left( \frac{-2\pi \sqrt{-1}\,(x \cdot \alpha)}{d} \right). \]

While the Fourier coefficients $\hat{b}_\alpha$ of a function $b : \mathbb{Z}_d^k \to \mathbb{R}$ might include complex numbers, Parseval's identity still applies: $\sum_{\alpha \in \mathbb{Z}_d^k} |\hat{b}_\alpha|^2 = E_{x \sim \mathbb{Z}_d^k}\left[ b(x)^2 \right]$. We first consider the following search problem: find $\sigma'$ that is $\epsilon$-correlated with $\sigma$ given $m$ randomly chosen challenge clauses from the distribution $Q_\sigma^{f,j}$ for $j \in \mathbb{Z}_d$. Remark 6 explains how to generalize our results to the problem we are interested in: find $\sigma'$ that is $\epsilon$-correlated with $\sigma$ given $m$ randomly chosen challenge-response pairs from the distribution $Q_\sigma^f$. In this section we let $U_k$ denote the uniform distribution over $X_k$.
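Parseval's identity for the complex basis functions $\chi_\alpha$ can be verified numerically on a small domain. The sketch below is illustrative only; it assumes the convention $\hat{b}_\alpha = E_{x}[b(x)\chi_\alpha(x)]$ and uses hypothetical function names.

```python
import cmath
from itertools import product


def chi(alpha, x, d):
    """Basis function chi_alpha(x) = exp(-2*pi*sqrt(-1)*(x . alpha)/d)."""
    dot = sum(a * b for a, b in zip(alpha, x))
    return cmath.exp(-2j * cmath.pi * dot / d)


def fourier_coefficients(b, d, k):
    """Coefficients b_hat[alpha] = E_x[b(x) * chi_alpha(x)] over x in Z_d^k."""
    points = list(product(range(d), repeat=k))
    return {
        alpha: sum(b(x) * chi(alpha, x, d) for x in points) / len(points)
        for alpha in points
    }


def parseval_gap(b, d, k):
    """|sum_alpha |b_hat[alpha]|^2 - E_x[b(x)^2]|, which should be ~0."""
    points = list(product(range(d), repeat=k))
    coeffs = fourier_coefficients(b, d, k)
    lhs = sum(abs(c) ** 2 for c in coeffs.values())
    rhs = sum(b(x) ** 2 for x in points) / len(points)
    return abs(lhs - rhs)


if __name__ == "__main__":
    b = lambda x: x[0] * x[1] + 2 * x[0] - 1  # arbitrary real-valued test function
    print(parseval_gap(b, d=3, k=2))
```

The individual coefficients are complex in general, but the sum of their squared magnitudes matches the real second moment of $b$ up to floating-point error.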


Definition 28. [73] Given a clause $C \in X_k$ and $S \subseteq [k]$ of size $\ell$, we let $C|_S \in X_\ell$ denote the clause of variables of $C$ at the positions with indices in $S$ (e.g., if $C = (1, \ldots, k)$ and $S = \{1, 5, k-2\}$ then $C|_S = (1, 5, k-2) \in X_3$). Given a function $h : X_k \to \mathbb{R}$ and a clause $C_\ell \in X_\ell$ we define

\[ h_\ell(C_\ell) = \frac{|X_\ell|}{|X_k|} \sum_{S \subseteq [k],\, |S| = \ell,\, C \in X_k,\, C|_S = C_\ell} h(C). \]
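The restriction $C|_S$ from Definition 28 is a simple projection. A minimal sketch, assuming 1-based positions and that the selected variables keep their original order (the function name is hypothetical):

```python
def restrict(clause, positions):
    """Return C|_S: the variables of `clause` at the 1-based positions in S,
    listed in increasing order of position."""
    return tuple(clause[i - 1] for i in sorted(positions))


if __name__ == "__main__":
    k = 9
    clause = tuple(range(1, k + 1))          # C = (1, ..., k)
    print(restrict(clause, {1, 5, k - 2}))   # positions 1, 5 and k - 2
```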


We first show that $\Delta(\sigma, h)$ can be expressed in terms of the Fourier coefficients of $Q$ as well as the functions $h_\ell$. In particular, we define the degree $\ell$ function $b_\ell : \mathbb{Z}_d^n \to \mathbb{C}$ as follows

be  (a)  =  QaJ^j  Xa  (o  (Q))  he  (Q)  . 

{  aeZkd:H(a)=f  C(eXe 




Notice that if $Q$ has distributional complexity $r$ and $\ell < r$ then $b_\ell(\sigma) = 0$ because $\hat{Q}_\alpha = 0$ for all $\alpha \in \mathbb{Z}_d^k$ s.t. $1 \le H(\alpha) < r$. This means that the terms of the sum in Lemma 5 with $\ell < r$ will be zero.


Lemma  5.  For  every  o  G  and  h  :  Xk  — >  1R 


Proof. 


EqJ/i]  =  ^/7(C)-Qa(C) 

^/7(C)-Q(a(C)) 


CeXk 

1 

1 


CeXk 


^/7(C)^Q«T«(a(C)) 


CeXk  aeZ* 


IXfcl 


YjQ-Tj1 h(C)Xa(0(C)) 


aeZj  CeXt 
a 

k 


Z  Q«Z/7(c)*«((7(c)) 

1  *'  (=0  aeXk.:H(a)=e  CeXk 


Observe  that  whenever  a  =  0  we  have 


Q«  =  E*~z*  [Q  (*)*«  (*)]  =  [Q  (x)]  =  y 

o  d 


QW 

dk 


l . 


xeZ” 


Therefore,  for  =  0  in  the  above  sum  we  have 


YmXa{o{C)) 

CeXk 


h  (C) 


1 


Z h  (c) 

CeXk 


EuJ/7]  . 
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Therefore, 


EQa  [h]  -  EUk  [h] 


Z  &  Z  Z  Z  h^xa(o(C)) 

1  k'  t=\  ae{0,...,d-l)k:H(a)=t  Sc[k],\S\=C  CeeX(  CeXk,qs=Cr 


1 

iXfci 

i 


'  k 

E  E  &E  Xa  (O  (Q))  E  E  h  (C) 

£=1  ae{0,...,<i-l}*:H(a)=£  QeX^  Sc[fc],|S|=f  CeXs.,C|s=Q  , 


E  E  q«E  *«  (o’  (Q))  •  (Q) 

£=1  aeZk.:H(a)=t  C(eX( 


P4I 

|Xd 


Z  Q«  Z  Xa(o(Ce))hdQ) 


t=t  aeZk:H(a)=(  C(eX( 


(  k 


EjjQjM<7> 


£=1 


□ 

The  following  lemma  is  similar  to  Lemma  4  from  [73].  Lemma  6  is  based  on  the 
general  hypercontractivity  theorem  [116,  Chapter  10]  and  applies  to  more  general 
(non-boolean)  functions. 

Lemma  6.  [116,  Theorem  10.23]  If  b  :  Z'j  — >  1R  has  degree  at  most  €  then  for  any 
t  >  ( yjlejd^  , 

Pr;[|i.Wl>«h]<lexp(-C(W), 

where  \\b\\2  =  "  \b  (x)2] 

Lemma  3  and  its  proof  are  almost  identical  to  Lemma  5  in  [73].  We  simply 
replace  their  concentration  bounds  with  the  concentration  bounds  in  Lemma  6. 
We  include  the  proof  for  completeness. 
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Reminder  of  Lemma  3.  Let  b  :  Z”  — >  IR  be  any  function  with  degree  at  most  £,  and 
let  S  c  Z’j  be  a  set  of  assignments  for  which  d'  =  dn/  |<S|  >  ec.  Then  E^s  [|b(a)|]  < 

2«*d'Jf°)m\\b\\2,  where  c0  =  £  (^)  and  \\b\\2  =  ^E.^z»  [b  (x)2]. 

Proof  of  Lemma  3.  The  set  S  contains  1/d'  fraction  of  points  in  Z".  Therefore, 


Pr[|6(x)|>tW2]<|exp(-T. 


f2/f)  , 


for  any  t  >  (  s/lejTj  .  For  any  random  variable  Y  and  value  a  e  IR, 


f 


E[Y]  <a+  I  Pr[Y  >  t]dt. . 

#\*/2 


We  set  Y  =  \b  (cr)|  /||b||2  and  a  =  Assuming  that  a  >  (  y/lejTj  We  get 


^a~s\Mo)\\ 

m\2 


<  (\nd' /co)t/2  + 


f  d--e-^(dt 

J (In  d'lc0)e/2  # 


(Ind'/coY12  +  l-d! 


d{ 


2  de  ■  ci/2 


/-\00 
J  In  d' 


e~zzei2~xdz 


(In  d'/coY12  +  t'd’ 


de 


2  d{  ■  c, 


,e/2 


(- e-zz{l2~l ) 


< 


(\n  d'/coY 12  t'd' 


de 

(In  d'/coY12 
d{ 


+ 


2  df  • 


\CI  21-1 

E 


0  {'=1/2 
h/21-1 


In  d' 


£'\ 


r%00 

+  (£/2  -  1)  e~zzm~2dz 

Jlnd’ 


In  d' 


+ 


R721L,  Jls.  2(lnrf'/Co)"2 


2dr-ctn  2-1  t’\ 

Ln  //_n 


E  * 


'0  £'=0 

where  we  used  the  condition  d'  >  to  obtain  the  last  inequality 


dc 


□ 


Lemma  7.  Let  S  c  {0, . . . ,  d  -  1}"  be  a  set  of  assignments  for  which  d'  =  d"  /  |*S|.  Then 


E, 


a~5 


(a) 


< 


4(lnd7co//2||Mu)||2 
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Proof.  For  simplicity  of  notation  we  set  b  =  b(.  By  Parseval's  identity  we  have 
Ea~z”  b  (o)  b  (o)  =  [| b  (u)|2] 

=  ^  E  Y  QaXa  (a  ( C())h[  (Q) 

«eZ*  C{EX( 

H{a)=C 

=  ^  ^  \Qa\  he  (Q) 

ae Z,kd  C(EX{ 

H(a)=C 

=  IXd  E  IQ«|2w;t  E  MQ) 

ae  Zj  1  1  QeXe 

H{a)=C 

=  m  Y  |q«|2iimi| 

«eZ * 

H{a)=e 

<  |X,|  llftflll . 

Before  we  can  apply  Lemma  3  we  must  address  a  technicality  The  range  of 
b  =  b(  might  include  complex  numbers,  but  Lemma  3  only  applies  to  functions 
b  with  range  R.  For  c,  d  £  R  we  adopt  the  notation  Im  (c  +  d  V— l)  =  d  and 
Re(c  +  d  V^T)  =  c.  We  observe  that 

E„-z;[f>(<T )6(ff)]  =  K.;  ./,  | Re  (!> (of) 1  +  Im(b (a))2 

=  ||Re(i>)ll2  +  l|Jm(f<)ll2. 

We  first  observe  that  Re  ( b )  and  Im  ( b )  are  both  degree  t  functions  because  we 
can  write 


Re  ( b  (cr))  =  E  E  Re(QaXa  (o ( Ce))h( (Q)) 

(  aeZkd:H(a)={  C(eX( 

and 

Im  (b  ( a ))  ==  —  Yj  Yj  Im  (&aXa  ( C( ^  h{  • 

^  aeZ.*:H(a)={  C{eX{ 




Now  we  can  apply  Lemma  3  to  get 

/  £ /2 

E„^[|fe(M<7))l]  <  2(lnd'JCl,)  -||to(i,)||2 

2  (In d' /coY 12  ,  Xll 

^  - -j-{ - y/\Xe\\\he(a)\\2. 

A  symmetric  argument  can  be  used  to  bound  E(J^5  [Im  (b  (cr))].  Now  because 

\b(o)\  <  \Re(b(o))\  +  \Im(b(o))\  , 

it  follows  that 


E 


o~S 


<  /  1  \/2(lnd7c0//2, 

“  \\X<\)\  d* 

<  4Qnd’/c0f2\\ht(o)\\2 

df  VjX^ 


\\he(o)\\2^ 


□ 


We will use Fact 6 to prove Lemma 8. The proof of Fact 6 is found in [73, Lemma 7]. We include it here for completeness.

Fact 6. [73] If $h : X_k \to \mathbb{R}$ satisfies $\|h\|_2 = 1$ then $\|h_\ell\|_2 \le 1$.


Proof.  First  notice  that  for  any  Q, 

|{C  e  X*  1 3S  c  [k],  s.t.  |S|  =  t  A  C|S  =  C,j|  =  j^j . 

By  applying  the  definition  of  h(  along  with  the  Cauchy-Schwartz  inequality 


WMl  =  Ec(~Uf[hAQf 


A 

v  > 

2~ 

)  EQ~uf 

E  h  (c) 

/ 

^Sc[kl\S\={,CeXk,qs=C{  , 

< 


< 


E 


/|Xd\2 

\\xk\l 

(— )e 
l|Xd/ 


Ce~Ue 


\x^ 

\Xe\ 


C(~U( 


E  h  (C) 

\SQ[k],\s\=e,cexk,qs=c{  ) 


E 


h(Cf 


LVSc[/c]/|S|=^CeXt/C|S=Cf  ) 


TEc~uk[h(Cf 


II2  =  1 
h  L  ■ 
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□ 

Lemma 8. Let $Q$ be a clause distribution with distributional complexity $r = r(Q)$, let $D' \subseteq \{Q_\sigma\}_{\sigma \in \{0,\ldots,d-1\}^n}$ be a set of distributions over clauses and $d' = d^n / |D'|$. Then

\[ \kappa_2(D', U_k) = O_k\left( (\ln d' / n)^{r/2} \right). \]


Proof.  Let<S  =  {cj  |  £  D'}andlet/z :  Xk  — >  Rbe  any  function  such  that  ~EUk  [h2]  =  1. 

Using  Lemma  5  and  the  definition  of  r, 


I A  (o,h)\ 


We  apply  Lemma  7  and  Fact  6  to  get 


]Eff~.s  [|A  (o,h)\\ 


<  ±  U(\nd'/c0f2\\hf(o)\\2\ 

d(  VKi  / 

<  y1  /4(ln  d'  /cqYI2\ 

M  dH poi  I 

(In  d'Y12 
denrl2 


□ 

Remark  6.  Recall  that  denotes  the  uniform  distribution  over  pairs  ( C,i )  £  Xk  x  Z ^ 
f/zaf  satisfy  f  (a  (C))  =  z.  we  Jef  If'  denote  the  uniform  distribution  over  Xk  x  Zrf  f/zen 
for  any  function  h  :  Xk  x  Zrf  — »  1R  we  can  apply  Lemma  8  to  write 


E(C,;)~Qa  PKC  /)]  -  E(C,;)~14  [MC  /)]  - 


< 


< 


where  hl  (C)  =  h  (C,  i). 


V(q]) 
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Reminder  of  Theorem  7.  Let  Zq,£  denote  the  problem  of  finding  for  every  o  £  Z" 
an  assignment  %  e  Z*  that  is  e-correlated  with  o  given  access  to  distribution  over 
Xk  x  Z  d-  Then  there  exists  a  constant  cq  >  0  such  that  for  any  e  >  1/  yfn  and  q  >  n, 


SDN 


z, 


Q,et  ' 


cq  (log  q) 


r/2 


n'V2 


-,2e_"'e2/2 


> 


w/zerc  r  =  r(f)  is  the  distributional  complexity  of  f. 


Proof  of  Theorem  7. 


We  use  the  uniform  distribution  U',  over  Zl,+1  as  our  reference  distribution 

k  a 

and  we  use  Du  =  D  =  [Q{j }aez.n{  to  denote  the  set  of  distributions  for  all  possible 
assignments.  First  note  that,  by  Chernoff  bounds,  for  any  solution  t  e  Z'j  the 
fraction  of  assignments  o  £  Z"  such  that  t  and  o  are  e-correlated  (e.g.,  H  ( a ,  t)  < 


—  e  ■  n)  is  at  most  e  2n'e".  In  other  words  \D%\  >  (l  -  e  2ne2^j  \D\,  where  Dx  = 
D\Z~nX T)-  LetD'  c  Dj  be  a  set  of  distributions  of  size  \DT\/q  and  S  =  [o\q{,  e  D'}. 


Then  for  d' 


d" /\D'\  =  q  ■  dn  j\D%\,  by  Lemma  8  and  remark  6,  we  get 


k2{D' ,  L/[)  =  Ojt 


Oi 


'(In  d')r^ 


n 


r/2 


l(\nq)r/2\ 

\  nr/2 


where  the  last  line  follows  by  Sterling's  Approximation 


(8.1) 

(8.2) 


q  =  d'\DT\/dn  =  d'\DT\/dn  «  d'c' 

for  a  constant  c' .  The  claim  now  follows  from  the  definition  of  SDN.  □ 

The  proof  of  Theorem  6  follows  from  Theorem  7  and  the  following  result  of 
Feldman  et  al.  [73]. 

Reminder  of  Theorem  5  [73,  Theorems  10  and  12].  Let  Xbe  a  domain  and  Zbe  a 
search  problem  over  a  set  of  solutions  T  and  a  class  of  distributions  D  over  X.  Tor  k  >  0 
and  q  e  (0, 1),  let  d'  =  SDN(Z^,  k,  if).  Let  D  be  the  reference  distribution  and  Du  be  a  set  of 
distributions  for  which  the  value  d'  is  achieved.  Any  randomized  statistical  algorithm  that, 
given  access  to  a  VSTAT^^^J  (resp.  1-MSTAT  (L))  for  a  distribution  chosen  randomly 
and  uniformly  from  Dd,  succeeds  with  probability  A  >  q  over  the  choice  of  distribution 
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and  internal  randomness  requires  at  least  jz~d'  (resp.  Q  ^  min 
to  the  oracle. 


d'(  a--i) 

i-i]  ' 


calls 


Reminder  of  Theorem  6.  Let  a  £  Z"  denote  a  secret  mapping  chosen  uniformly 
at  random  and  let  Z,Qf  he  a  planted  constrained  satisfiability  problem  with  distribution 
q{  over  Xk  x  Z where  f  has  distributional  complexity  r  =  r(f).  Any  randomized 

statistical  algorithm  that  finds  an  assignment  z  such  that  z  is  j- correlated 

with  a  with  probability  at  least  A  >  q  over  the  choice  of  o  and  the  internal  randomness  of 

the  algorithm  needs  at  least  m  calls  to  the  1-MSTAT(L )  oracle  (resp.  VSTAT  —  ) with 

\2(l°gf!j  / 


m  ■  L  >  C\  (resp.  m  >  nCllo%n)  for  a  constant  C\  =  Llklj^  ,  In  particidar  if  we 

/  \  ?'/2  /  \r/2 

set  L  =  (j^J  then  our  algorithms  needs  at  least  m  >  C\  j  calls  to  l-MSTAT(L). 


Proof  of  Theorem  6.  We  set  e  =  ^  ancj  observe  that  2e  ne2jl2  =  q,  and  we  set 

q  =  nlog”  in  Theorem  7.  Notice  that  we  used  Du  =  {Q((},,ez;;  in  the  proof  of  Theorem 
7.  Now  we  apply  Theorem  5  to  get  the  desired  lower  bound 


— 21n(r,/2) 


( 


m  =  Q 


mm  < 


n°k(losn)  (A  -q) 
1-q 


n 


r/2 


Cq  (log  (nl0S")) 


r/2 


(A  -  qf  /L 


=  Q 


n 


dog  2rn 


\/L 


for  the  1-MSTAT  (L)  oracle.  For  the  VSTAT  (  )  oracle  we  get  m 

\2(l°g n)  J 


n0k(lo8n)(A-r]) 

1-j] 


□ 


8.3  Security  Proofs 


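Reminder of Theorem 8. Let $f$ be $(\delta_1, \delta_2)$-hard to predict, let $\sigma \sim \mathbb{Z}_d^n$ denote the secret mapping, let $\epsilon > 0$ be any constant and suppose that we are given labels $\ell_C \in \mathbb{Z}_d$ for every $C \in X_k$ s.t.

\[ \Pr_{C \sim X_k}\left[ f(\sigma(C)) = \ell_C \right] \ge \frac{1}{d} + \delta_2 + \epsilon. \]

There is a polynomial time algorithm (in $n$, $1/\epsilon$, $1/\delta_2$) that with high probability finds a mapping $\sigma' \in \mathbb{Z}_d^n$ such that $\sigma'$ is $\delta_1$-correlated with $\sigma$ provided that $\sigma$ is $\delta_1$-balanced.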
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Proof  of  Theorem  8.  Let  o  €  Z”  be  given  such  that  a  is  Sj-balanced  (e.g.,  < 

max;6Zd  (^)  <  +  h- 

We  set  t  =  4  lri  (T)  and  select  clauses  C:j, . . .  C{  at  random  for  j  e  [y\.  Then  by 
Chernoff  bounds 

Pr  {*  ed  =f(°(C’i))}  ~  \  +  Tb2  ~  exp(2^1n(T)e2) 

=  exp  (4  In  (T)) 

J_ 

_  p4  ' 

Let  T  =  n,  set  S'  =  IJ/=i  C-/  and  define  BAD'  to  be  the  event  that 

{CeXk\ccSi  Atc  =  f(o(C))}\  1 

-  ^  —  “1“  62  • 

|{C  e  Xjt  |  C  c  S'}\  ~d 

By  the  union  bound 

J_ 

p4 

c  T 
62kT~ 

1 

<  —  . 

n3 

If  we  set  i/  =  0(n  log  n)  then  with  high  probability  ,  S-1  =  [;/].  By  applying  the 
union  bound  again  we  have 

Pr[3jely].BADi]<±. 

Now  for  each  j  e  [y]  we  can  enumerate  over  all  n°((2klnd)/6 %)  mappings  o!  e  z|fS  ^  to 
find  one  that  satisfies 

{C€Xk\CQSiA€c  =  f(o>(C))}\  1 

|{C  €  X)t  |  C  C  Si}  |  -  d+ 62 ' 


Pr  [BAD']  <  [  e'~  j 


2kkk  In' 
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in  polynomial  time.  Because  /  is  (Si,  <52)-hard  to  predict  o’  (Sjj  must  be  <5i-correlated 

with  a  (S^j  (e.g.,  —  <  ff  -  Si).  Now  we  can  find  o'  s.t.  H  (o,  o')  <  ^  -  Si 

by  combining  the  ov s.  □ 

Reminder  of  Claim  1.  Let  be  an  adversary  s.t  Pr  [Wins  m,t )]  >  (i  +  6  +  e)f 

and  let  b  =  J\.cx,...,cm  ^ien 


.  Pr  Kq . c;  (C  0  =  /  (a  (Q)  P,q,  ,c;  (C,0  *  J-l  >  ( \  +  6  +  e 


Clf...,Cm~Xk 

q . q~x* 


Proof  of  Claim  1.  We  draw  examples  (C\,f{o  (Ci))) , . . . ,  ( Cm,f(o  (Cm)))  to  construct 
&  =  Given  a  random  length-f  password  challenge  (Cf ... ,  C')  €  (Xf)‘  we 

let 

»  =  c.c . c,Py..q-x,  . « <0  =  /<*  (O)  |  n„q . q  (O  *  ±] 

denote  the  probability  that  the  adversary  correctly  guesses  the  response  to  the  )’ th 
challenge  conditioned  on  the  event  that  the  adversary  correctly  guesses  all  of  the 
earlier  challenges.  Observe  that 

cc, . c,qPrc;  ^  f** . <c'  '■>  =  / <C))]  =  t  P‘/‘ ' 

1=1 

so  it  suffices  to  show  that  Yfi=\  Pi/ 1  -  ]+  5  +  e.  We  obtain  the  following  constraint 


n*  =  lice . c.q . q-x,Kc; . c:(C)=/(a(C))  |n„q . q  (O  *  ±] 

1=1  !=1  1 

=  lie . . . c0m=/Hc0)I 

1=1  1 

. c,  (c; . C;)[;]  =  /(o(c'))] 

=  c, c  Jl,CrXl  [*C: c„  (C c;)  =  (/  (a  (c;)) . /  (a  (Q))] 

*  (i+6;e; 
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If  we  minimize  jfi=1  Pi/t  subject  to  the  constraint  n!=i  Pi  >  (y  +  i)  +  e)  then  we 
obtain  the  desired  upper  bound  £-=1  Pi/t  >  \  +  6  +  e.  □ 

Reminder  of  Theorem  9.  Suppose  that  f  is  (Si,  62) — hard  to  predict,  but  that  f  is 
not  UF  -  RCA  (n,m,t,  6)  -  secure  for  5  >  +  62  +  e)  .  Then  there  is  a  probabilistic 

polynomial  time  algorithm  (in  n,  m,  l/b\,  1/62,1/e)  that  extracts  a  string  o'  e  Z"  that  is 
c-correlated  with  a  after  seeing  O  (m)  examples,  where  c  >  0  is  a  constant. 

Proof  of  Theorem  9.  (sketch)  We  first  partition  Xk  into  T  =  O  ( lc>g(|X,~l )  sets  Si, . . . ,  ST 
of  equal  size.  We  let  Uj  denote  the  set  of  unlabeled  clauses  from  S  ,  at  time  i.  Initially, 

Uq  =  Sj.  During  step  i  we  draw  mT  labeled  examples  (cf ,  f  (o  , . . . ,  ( 'c'f ,  f  (o  ~ 

qI  and  t  labeled  examples  (^Cf,  f  (a  ,  •  •  • ,  (cf ,  f  (a  ~  Q^.  For  each 

clause  C  G  U)  we  select  kc  ~  [t]  uniformly  at  random.  We  set 


U’+1  =  {C€Ul 


'^b.kcC/ . cip  (C)  -L  i  / 


and  we  set 


-  ^bXc.cp . c1;’ 


for  all  C  G  Ul\UJi+r  Here,  b  =  qn. 

We  first  argue  that  O  ( j  rounds  suffice  to  label  every  clause  C  G  Xk.  Notice 


that  VC  G  Xi-  we  have 


Pr 

Hf] 


*P :  Mi  r‘,i  (C)  +  -L 


w . c; 


>  Pr  [/  =  1]  =  -  . 

HA  t 


c'/ . c';{~xk 

*i,j  fiA 


The  probability  that  a  clause  C  hasn't  been  labeled  after  i  rounds  is  at  most  (l  -  y)  . 

By  union  bounds  the  probability  that  any  clause  is  unlabeled  is  at  most  \Xk\  (l  -  y) 

so  after  i  =  O  j  rounds  we  will  have  111  =  0  for  all  j  G  [T], 

We  now  argue  that  with  high  probability  we  label  at  least  \  +  6  +  e/2  clauses 
correctly.  Formally,  let 

xi  =  \{CeSl\{c  =  f(o(C))}\  , 
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denote  the  number  of  clauses  in  S,  labeled  correctly.  Notice  that  the  random 
variables  are  independent,  and  by  Claim  1  we  have 


Now  by  Chernoff  Bounds  it  follows  that  with  probability  1  -  o(l) 


so  we  can  apply  Lemma  8  to  obtain  the  desired  result. 


□ 


8.4  Proofs  of  Claims  and  Facts 

Reminder of Claim 2. $r(f_1) = 3$, $g(f_1) = 2$ and $s(f_1) = 3/2$.

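Proof of Claim 2. We first observe that if we fix the values of $x_{10}, x_{11} \in \mathbb{Z}_{10}$ then $f'(x_0, \ldots, x_9, x_{12}, x_{13}) = f_1(x_0, \ldots, x_{13})$ is a linear function. Thus, $g(f_1) = 2$. We also note that for any $\alpha \in \mathbb{Z}_{10}^{14}$ s.t. $H(\alpha) < 3$ and $i, t \in \mathbb{Z}_{10}$ that

\[ \Pr_{x \sim \mathbb{Z}_{10}^{14}}\left[ f_1(x) = t \mid \alpha \cdot x \equiv i \bmod 10 \right] = \Pr_{x \sim \mathbb{Z}_{10}^{14}}\left[ f_1(x) = t \right] = \frac{1}{10}. \]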

Therefore, 

c3i’-'  =  K^z;0[Q/"'W^«W] 

9 

=  ^  Pr  [a  -  x  =  i  mod  10]  Ex~z*  \Q'fl,t  ( x )  (x)  a-x  =  i  mod  lo] 

1=0 
9 

=  Yjex  p 

1=0 


\Qfl,t  (x)  a  •  x  =  i  mod  lo] 

=  0, 


~Yq — -  j  Pr  [a  •  x  =  i  mod  10]  ^x~z.kw  [ Q ^1,f  (x)  a  •  x  =  z  mod  lo] 
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which  implies  that  r(f)  >3.  □ 

Reminder of Claim 3. $r(f_2) = 4$, $g(f_2) = 1$ and $s(f_2) = 2$.

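Proof of Claim 3. We first observe that if we fix the value of $x_{10} \in \mathbb{Z}_{10}$ then $f'(x_0, \ldots, x_9, x_{11}, x_{12}, x_{13}) = f_2(x_0, \ldots, x_{13})$ is a linear function. Thus, $g(f_2) = 1$. We also note that for any $\alpha \in \mathbb{Z}_{10}^{14}$ s.t. $H(\alpha) < 4$ and $i, t \in \mathbb{Z}_{10}$ that

\[ \Pr_{x \sim \mathbb{Z}_{10}^{14}}\left[ f_2(x) = t \mid \alpha \cdot x \equiv i \bmod 10 \right] = \Pr_{x \sim \mathbb{Z}_{10}^{14}}\left[ f_2(x) = t \right] = \frac{1}{10}. \]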


Therefore, 

QP  =  K*-z*  [Q/"' W] 


-2ra  V-l'' 


^  Pr  [a  •  x  =  i  mod  10] 

i=0 
9 

Lexp. 

i=0  ' 

1  y 

IoLexP 


10 

-2ni  V-l 


[Q/l/f  (*)*«  (*)  a  •  x  =  z  mod  loj 
Pr  [a-  x  =  i  mod  10]  [(V1,f  (x)  a  -  x  =  i  mod  lo] 


)  ^x~zkw  [Qfl,‘  (x)  a  '  x  =  /  mod  lo] 


z=o 


□ 


10  / 

t— u  ' 

=  0, 

which implies that $r(f_2) \ge 4$.

Reminder of Fact 2. $f_1$ is $(0.01, 0.045)$-hard to predict.

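Proof of Fact 2. Let $\sigma, \sigma' \in \mathbb{Z}_{10}^n$ be given. We assume that $\sigma$ is $\left(\frac{1}{100}\right)$-balanced and that $\sigma$ and $\sigma'$ are not $\left(\frac{1}{100}\right)$-correlated. This means that

$$\max_{i \in \mathbb{Z}_{10}} \left| \Pr_{x \sim [n]}[\sigma(x) = i] - \frac{1}{10} \right| \le \frac{1}{100} \quad \text{and} \quad \left(1 - \frac{H(\sigma, \sigma')}{n}\right) \le \frac{1}{10} + \frac{1}{100}.$$

It suffices to show that

$$\Pr_{C \sim X_{14}}[f_1(\sigma(C)) = f_1(\sigma'(C))] \le \frac{145}{1000}.$$

For $j \in \mathbb{Z}_{10}$ we let $p_j = \frac{n - H(\sigma_j, \sigma')}{n}$ where $\sigma_j(i) = (\sigma(i) + j \bmod 10)$. In particular, $p_0 = \frac{n - H(\sigma, \sigma')}{n}$ denotes the probability that $\sigma(i) = \sigma'(i)$ for a random index $i \sim [n]$. By assumption $p_0 \le \frac{1}{10} + \frac{1}{100}$. We have

$$\begin{aligned}
&\Pr_{C=(x_1,\ldots,x_{14}) \sim X_{14}}[f_1(\sigma(C)) = f_1(\sigma'(C))] \\
&= \Bigg(\sum_{\substack{i,j \in \mathbb{Z}_{10} \\ i+j \equiv 0 \bmod 10}} p_i p_j\Bigg) \Pr_{C \sim X_{14}}\big[f_1(\sigma(C)) = f_1(\sigma'(C)) \,\big|\, \sigma(x_{11}) + \sigma(x_{12}) \equiv \sigma'(x_{11}) + \sigma'(x_{12}) \bmod 10\big] \\
&\quad + \Bigg(1 - \sum_{\substack{i,j \in \mathbb{Z}_{10} \\ i+j \equiv 0 \bmod 10}} p_i p_j\Bigg) \Pr_{C \sim X_{14}}\big[f_1(\sigma(C)) = f_1(\sigma'(C)) \,\big|\, \sigma(x_{11}) + \sigma(x_{12}) \not\equiv \sigma'(x_{11}) + \sigma'(x_{12}) \bmod 10\big] \\
&= \Bigg(\sum_{\substack{i,j \in \mathbb{Z}_{10} \\ i+j \equiv 0 \bmod 10}} p_i p_j\Bigg) \Pr_{(x_1,x_2,x_3) \sim X_3}\Bigg[\sum_{i=1}^{3} \big(\sigma(x_i) - \sigma'(x_i)\big) \equiv 0 \bmod 10\Bigg] \\
&\quad + \Bigg(1 - \sum_{\substack{i,j \in \mathbb{Z}_{10} \\ i+j \equiv 0 \bmod 10}} p_i p_j\Bigg) \Pr_{(x,y,x_{13},x_{14}) \sim X_4}\big[\sigma(x) + \sigma(x_{13}) + \sigma(x_{14}) \equiv \sigma'(y) + \sigma'(x_{13}) + \sigma'(x_{14}) \bmod 10\big] \\
&= \Bigg(\sum_{\substack{i,j \in \mathbb{Z}_{10} \\ i+j \equiv 0 \bmod 10}} p_i p_j\Bigg) \Bigg(\sum_{\substack{i,j,k \in \mathbb{Z}_{10} \\ i+j+k \equiv 0 \bmod 10}} p_i p_j p_k\Bigg) \\
&\quad + \Bigg(1 - \sum_{\substack{i,j \in \mathbb{Z}_{10} \\ i+j \equiv 0 \bmod 10}} p_i p_j\Bigg) \Pr_{(x,y,x_{13},x_{14}) \sim X_4}\big[\sigma(x) + \sigma(x_{13}) + \sigma(x_{14}) \equiv \sigma'(y) + \sigma'(x_{13}) + \sigma'(x_{14}) \bmod 10\big] \\
&\le \Bigg(\sum_{\substack{i,j \in \mathbb{Z}_{10} \\ i+j \equiv 0 \bmod 10}} p_i p_j\Bigg) \Bigg(\sum_{\substack{i,j,k \in \mathbb{Z}_{10} \\ i+j+k \equiv 0 \bmod 10}} p_i p_j p_k\Bigg) + \Bigg(1 - \sum_{\substack{i,j \in \mathbb{Z}_{10} \\ i+j \equiv 0 \bmod 10}} p_i p_j\Bigg) \left(\frac{11}{100}\right).
\end{aligned}$$

Maximizing this last expression subject to the constraint that $p_0 \le \frac{1}{10} + \frac{1}{100}$, we obtain the desired upper bound

$$\Bigg(\sum_{\substack{i,j \in \mathbb{Z}_{10} \\ i+j \equiv 0 \bmod 10}} p_i p_j\Bigg) \Bigg(\sum_{\substack{i,j,k \in \mathbb{Z}_{10} \\ i+j+k \equiv 0 \bmod 10}} p_i p_j p_k\Bigg) + \Bigg(1 - \sum_{\substack{i,j \in \mathbb{Z}_{10} \\ i+j \equiv 0 \bmod 10}} p_i p_j\Bigg) \left(\frac{11}{100}\right) \le \frac{145}{1000}.$$

□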


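Reminder of Fact 3. $f_2$ is $(0.01, 0.01)$-hard to predict.

Proof of Fact 3. Let $\sigma, \sigma' \in \mathbb{Z}_{10}^n$ be given. We assume that $\sigma$ is $\left(\frac{1}{100}\right)$-balanced and that $\sigma$ and $\sigma'$ are not $\left(\frac{1}{100}\right)$-correlated. This means that

$$\max_{i \in \mathbb{Z}_{10}} \left| \Pr_{x \sim [n]}[\sigma(x) = i] - \frac{1}{10} \right| \le \frac{1}{100} \quad \text{and} \quad \left(1 - \frac{H(\sigma, \sigma')}{n}\right) \le \frac{11}{100}.$$

It suffices to show that

$$\Pr_{C=(x_1,\ldots,x_{14}) \sim X_{14}}[f_2(\sigma(C)) = f_2(\sigma'(C))] \le \frac{11}{100}.$$

For $j \in \mathbb{Z}_{10}$ we let $p_j = \frac{n - H(\sigma_j, \sigma')}{n}$ where $\sigma_j(i) = (\sigma(i) + j \bmod 10)$. In particular, $p_0 = \frac{n - H(\sigma, \sigma')}{n}$ denotes the probability that $\sigma(i) = \sigma'(i)$ for a random index $i \sim [n]$. By assumption $p_0 \le \frac{1}{10} + \frac{1}{100}$. We have

$$\begin{aligned}
&\Pr_{C=(x_1,\ldots,x_{14}) \sim X_{14}}[f_2(\sigma(C)) = f_2(\sigma'(C))] \\
&= \Pr_{C \sim X_{14}}[\sigma(x_{11}) = \sigma'(x_{11})] \Pr_{C \sim X_{14}}\big[f_2(\sigma(C)) = f_2(\sigma'(C)) \,\big|\, \sigma(x_{11}) = \sigma'(x_{11})\big] \\
&\quad + \Pr_{C \sim X_{14}}[\sigma(x_{11}) \ne \sigma'(x_{11})] \Pr_{C \sim X_{14}}\big[f_2(\sigma(C)) = f_2(\sigma'(C)) \,\big|\, \sigma(x_{11}) \ne \sigma'(x_{11})\big] \\
&= p_0 \Pr_{C \sim X_{14}}\big[f_2(\sigma(C)) = f_2(\sigma'(C)) \,\big|\, \sigma(x_{11}) = \sigma'(x_{11})\big] \\
&\quad + (1 - p_0) \Pr_{C \sim X_{14}}\big[f_2(\sigma(C)) = f_2(\sigma'(C)) \,\big|\, \sigma(x_{11}) \ne \sigma'(x_{11})\big] \\
&= p_0 \Pr_{(x_{12},x_{13},x_{14}) \sim X_3}\big[\sigma(x_{12}) + \ldots + \sigma(x_{14}) \equiv \sigma'(x_{12}) + \ldots + \sigma'(x_{14}) \bmod 10\big] \\
&\quad + (1 - p_0) \Pr_{C \sim X_{14}}\big[f_2(\sigma(C)) = f_2(\sigma'(C)) \,\big|\, \sigma(x_{11}) \ne \sigma'(x_{11})\big] \\
&= p_0 \sum_{\substack{i,j,k \in \mathbb{Z}_{10} \\ i+j+k \equiv 0 \bmod 10}} p_i p_j p_k + (1 - p_0) \Pr_{C \sim X_{14}}\big[f_2(\sigma(C)) = f_2(\sigma'(C)) \,\big|\, \sigma(x_{11}) \ne \sigma'(x_{11})\big] \\
&= p_0 \sum_{\substack{i,j,k \in \mathbb{Z}_{10} \\ i+j+k \equiv 0 \bmod 10}} p_i p_j p_k \\
&\quad + (1 - p_0) \Pr_{(x,y,x_{12},x_{13},x_{14}) \sim X_5}\big[\sigma(x) + \sigma(x_{12}) + \sigma(x_{13}) + \sigma(x_{14}) \equiv \sigma'(y) + \sigma'(x_{12}) + \sigma'(x_{13}) + \sigma'(x_{14}) \bmod 10\big] \\
&\le p_0 \sum_{\substack{i,j,k \in \mathbb{Z}_{10} \\ i+j+k \equiv 0 \bmod 10}} p_i p_j p_k + (1 - p_0) \max_{j \in \mathbb{Z}_{10}} \Pr_{x \sim [n]}[\sigma(x) \equiv j \bmod 10] \\
&\le p_0 \sum_{\substack{i,j,k \in \mathbb{Z}_{10} \\ i+j+k \equiv 0 \bmod 10}} p_i p_j p_k + (1 - p_0) \left(\frac{11}{100}\right).
\end{aligned}$$

Maximizing

$$p_0 \sum_{\substack{i,j,k \in \mathbb{Z}_{10} \\ i+j+k \equiv 0 \bmod 10}} p_i p_j p_k + (1 - p_0) \left(\frac{11}{100}\right)$$

subject to the constraint that $p_0 \le \frac{11}{100}$ we obtain the desired upper bound

$$p_0 \sum_{\substack{i,j,k \in \mathbb{Z}_{10} \\ i+j+k \equiv 0 \bmod 10}} p_i p_j p_k + (1 - p_0) \left(\frac{11}{100}\right) \le \frac{11}{100}.$$

□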


8.4.1  Security  Upper  Bounds 


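Background The proof of Theorem 10 relies on the discrete spectral iteration algorithm of [73]. We begin by providing a brief overview of their algorithm. In their setting the secret mapping $\sigma$ is defined over the binary alphabet $\mathbb{Z}_2^n$. Let $k_1 = \lceil r(f)/2 \rceil$, $k_2 = \lfloor r(f)/2 \rfloor$ and let $\delta \in [0,2] \setminus \{1\}$. They use $\sigma$ to define a distribution over $|X_{k_1}| \times |X_{k_2}|$ matrices $M_{\sigma,\delta,p} = M(Q_{\sigma,\delta,p}) - Jp$, where $J$ denotes the all ones matrix. For $(C_1) \in X_{k_1}$, $(C_2) \in X_{k_2}$ such that $C_1 \cap C_2 = \emptyset$ we have

$$M(Q_{\sigma,\delta,p})[(C_1),(C_2)] = \begin{cases} 1, & \text{with probability } p(2 - \delta) \text{ if } \sum_{j \in C_1 \cup C_2} \sigma(j) \equiv 0 \bmod 2 \\ 1, & \text{with probability } p\delta \text{ if } \sum_{j \in C_1 \cup C_2} \sigma(j) \not\equiv 0 \bmod 2 \\ 0, & \text{otherwise.} \end{cases}$$

Given a vector $x \in \{\pm 1\}^{|X_{k_2}|}$ (resp. $y \in \{\pm 1\}^{|X_{k_1}|}$), $M_{\sigma,\delta,p}\,x$ defines a distribution over vectors in $\mathbb{R}^{|X_{k_1}|}$ (resp. $M_{\sigma,\delta,p}^T\,y$ defines a distribution over vectors in $\mathbb{R}^{|X_{k_2}|}$).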


If $r(f)$ is even then the largest eigenvalue of $\mathbb{E}[M_{\sigma,\delta,p}]$ has a corresponding eigenvector $x^* \in \{\pm 1\}^{|X_{r(f)/2}|}$, where for $C_i \in X_{r(f)/2}$ we have $x^*[C_i] = 1$ if $\sum_{j \in C_i} \sigma(j) \equiv 1 \bmod 2$; otherwise $x^*[C_i] = -1$ (if $r(f)$ is odd then we consider the top singular value instead). Feldman et al. [73] use discrete spectral iteration to find $x^*$. Given $x^*$ it is easy to find $\sigma$ using Gaussian Elimination.
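The role of the top eigenvector can be illustrated with a toy example. The sketch below is not the algorithm of [73]: it replaces the sampled, oracle-based update by a noise-free matrix product, and it assumes a rank-one matrix $x^* (x^*)^T$ indexed by single positions rather than by clauses. The secret, the function names and the number of steps are all hypothetical.

```python
import random

def sign(v):
    """Round a real number to +1 or -1 (zero is sent to +1)."""
    return 1 if v >= 0 else -1

def sign_power_iteration(A, x0, steps=3):
    """Repeatedly apply x <- sign(A x), a noise-free stand-in for the sampled update."""
    x = list(x0)
    for _ in range(steps):
        x = [sign(sum(a * b for a, b in zip(row, x))) for row in A]
    return x

# Hypothetical secret over Z_2 and the +/-1 parity vector it induces on single positions.
secret = [1, 0, 0, 1, 1, 0, 1]            # odd length, so no +/-1 vector is orthogonal to x*
x_star = [1 if bit == 1 else -1 for bit in secret]
A = [[u * v for v in x_star] for u in x_star]   # rank-one matrix x* x*^T

rng = random.Random(0)
x0 = [rng.choice((-1, 1)) for _ in x_star]
recovered = sign_power_iteration(A, x0)
```

Because the inner product of two $\pm 1$ vectors of odd length is never zero, the iteration lands on $x^*$ or $-x^*$ after one step in this idealized setting; the real algorithm has to cope with sampling noise and needs the normalization step described in the text.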

The discrete spectral iteration algorithm of Feldman et al. [73] starts with a random vector $x^0 \in \{\pm 1\}^{|X_{k_2}|}$. They then sample $x^{i+1} \sim M_{\sigma,\delta,p}\,x^i$ followed by a normalization step to ensure that $x^{i+1} \in \{\pm 1\}^{|X_{k_2}|}$. When $r(f)$ is odd, power iteration has two steps: draw a sample $y^i \sim M_{\sigma,\delta,p}\,x^i$ and sample from the distribution $x^{i+1} \sim M_{\sigma,\delta,p}^T\,y^i$. They


showed that $O(\log |X_{r(f)}|)$ iterations suffice to recover $\sigma$ whenever $p$


£igg faflj 
(5-D2^d' 


and that for a vector $x \in \{\pm 1\}^{|X_{k_2}|}$ (resp. $y \in \{\pm 1\}^{|X_{k_1}|}$) it is possible to sample from $M_{\sigma,\delta,p}\,x$ (resp. $M_{\sigma,\delta,p}^T\,y$) using $O(1/p)$ queries to 1-MSTAT$(|X_{k_1}|)$.

Our Reduction The proof of Theorem 10 uses a reduction to the algorithm of Feldman et al. [73].

Reminder of Theorem 10. For $f \in \{f_1, f_2\}$ there is a randomized algorithm that makes $O\left(n^{\max\{1, r(f)/2\}} \log^2 n\right)$ calls to the 1-MSTAT$\left(n^{\lceil r(f)/2 \rceil}\right)$ oracle and returns $\sigma$ with probability $1 - o(1)$.

Proof of Theorem 10. (sketch) Given a mapping $\sigma \in \mathbb{Z}_d^n$ and a number $i \in \mathbb{Z}_d$ we define a mapping $\sigma_i \in \mathbb{Z}_2^n$ where

$$\sigma_i(j) = \begin{cases} 1, & \text{if } \sigma(j) = i \\ 0, & \text{otherwise.} \end{cases}$$

Clearly, to recover $\sigma$ it is sufficient to recover $\sigma_i$ for each $i \in \mathbb{Z}_d$. Therefore, to prove Theorem 10 it suffices to show that given $x \in \{\pm 1\}^{|X_{k_2}|}$ (resp. $y \in \{\pm 1\}^{|X_{k_1}|}$) we can sample from the distribution $M_{\sigma_i,\delta,p}\,x$ (resp. $M_{\sigma_i,\delta,p}^T\,y$) using $O(1/p)$ queries to 1-MSTAT for each $i \in \{0, \ldots, d-1\}$, where 1-MSTAT uses the distribution $Q_\sigma^f$. In general, this will not be possible for arbitrary functions $f$. However, Lemma 9 shows that for our candidate human computable functions $f_1, f_2$ we can sample from the distributions $M_{\sigma_i,\delta,p}\,x$ (resp. $M_{\sigma_i,\delta,p}^T\,y$). The proof of Lemma 9 is similar to the proof of [73, Lemma 10]. □
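The first step of the reduction, splitting $\sigma$ into binary indicator mappings $\sigma_i$ and reassembling it, can be sketched as follows. The function names and the example mapping are hypothetical; the sketch only illustrates why recovering every $\sigma_i$ is enough to recover $\sigma$.

```python
def indicator_mappings(sigma, d):
    """Return [sigma_0, ..., sigma_{d-1}] with sigma_i[j] = 1 iff sigma[j] == i."""
    return [[1 if value == i else 0 for value in sigma] for i in range(d)]

def recombine(indicators):
    """Rebuild sigma from its indicator mappings."""
    n = len(indicators[0])
    sigma = []
    for j in range(n):
        digits = [i for i, sigma_i in enumerate(indicators) if sigma_i[j] == 1]
        assert len(digits) == 1  # exactly one indicator fires at each position
        sigma.append(digits[0])
    return sigma

sigma = [3, 1, 4, 1, 5, 9, 2, 6]   # hypothetical secret mapping over Z_10
parts = indicator_mappings(sigma, 10)
```

Each `parts[i]` is a mapping over the binary alphabet, which is the setting in which the algorithm of [73] operates.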

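Lemma 9. Given vectors $x \in \{\pm 1\}^{|X_{k_2}|}$, $y \in \{\pm 1\}^{|X_{k_1}|}$ we can sample from $M_{\sigma,\delta,p}\,x$ and $M_{\sigma,\delta,p}^T\,y$ using $O\left(n^{r(f)/2} \log^2 n\right)$ calls to the 1-MSTAT$\left(n^{\lceil r(f)/2 \rceil}\right)$ oracle for $f \in \{f_1, f_2\}$.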


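The proof of Lemma 9 relies on Facts 7 and 8.

Fact 7. For each $j, t \in \mathbb{Z}_{10}$ we have

$$\Pr_{(x_0,\ldots,x_{13}) \sim \mathbb{Z}_{10}^{14}}\big[x_{12} + x_{13} + x_t \equiv j \,\big|\, f_1(\sigma(x_0,\ldots,x_{13})) \equiv j \bmod 10\big] = \frac{\left(\frac{1}{10}\right)\left(\left(\frac{1}{10}\right) 1 + \left(\frac{9}{10}\right)\left(\frac{1}{10}\right)\right)}{\left(\frac{1}{10}\right)} = \frac{19}{100},$$

and

$$\Pr_{(x_0,\ldots,x_{13}) \sim \mathbb{Z}_{10}^{14}}\big[x_{12} + x_{13} + x_t \equiv j \,\big|\, f_1(\sigma(x_0,\ldots,x_{13})) \not\equiv j \bmod 10\big] = \frac{\left(\frac{1}{10}\right)\left(\left(\frac{1}{10}\right)(0) + \left(\frac{9}{10}\right)\left(\frac{9}{10}\right)\right)}{\left(\frac{9}{10}\right)} = \frac{9}{100}.$$

Fact 8. For each $j, t \in \mathbb{Z}_{10}$ we have

$$\Pr_{(x_0,\ldots,x_{13}) \sim \mathbb{Z}_{10}^{14}}\big[x_{11} + x_{12} + x_{13} + x_t \equiv j \,\big|\, f_2(\sigma(x_0,\ldots,x_{13})) \equiv j \bmod 10\big] = \frac{\left(\left(\frac{9}{10}\right)\left(\frac{1}{10}\right) + \frac{1}{10}\right)\left(\frac{1}{10}\right)}{\left(\frac{1}{10}\right)} = \frac{19}{100},$$

and

$$\Pr_{(x_0,\ldots,x_{13}) \sim \mathbb{Z}_{10}^{14}}\big[x_{11} + x_{12} + x_{13} + x_t \equiv j \,\big|\, f_2(\sigma(x_0,\ldots,x_{13})) \not\equiv j \bmod 10\big] = \frac{\left(\left(\frac{9}{10}\right)\left(\frac{9}{10}\right) + \frac{1}{10}(0)\right)\left(\frac{1}{10}\right)}{\left(\frac{9}{10}\right)} = \frac{9}{100}.$$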

Proof of Lemma 9. Let $x_j^i \in \{0, 1\}$ denote a random variable that is 1 if and only if $x_j = i$. For $f_1$ we define the function $h^{i,+} : X_{14} \times \mathbb{Z}_{10} \to X_3 \cup \{\perp\}$ as follows

$$h^{i,+}\big(x_0, \ldots, x_{13}, f_1(\sigma(x_0, \ldots, x_{13}))\big) = \begin{cases} (x_0, x_{12}, x_{13}) & \text{if } f_1(\sigma(x_0, \ldots, x_{13})) \equiv 3i \bmod 10 \\ \perp & \text{otherwise.} \end{cases}$$

For $f_2$ we simply change the condition to $f_2(\sigma(x_0, \ldots, x_{13})) \equiv 4i \bmod 10$ for $h^{i,+}$.

Given a vector $x \in \{\pm 1\}^{|X_{k_1}|}$ we query our 1-MSTAT$(|X_{k_1}| + 1)$ oracle $\lceil 10/p \rceil$ times with the function $h^{i,+}$ to sample from $M_{\sigma,\delta,p}\,x$. Let $q_1, \ldots, q_{\lceil 10/p \rceil}$ denote the responses. We observe that for $C \in X_{k_2}$ we have

MaApx[C\  ~  x  ip]  ~  P  Xj  ' 

rio/pl  C'eXh 

qi=c 


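for some $\delta \ne 1$ because by Fact 7 it follows that

$$\begin{aligned}
&\Pr_{(x_0,\ldots,x_{13}) \sim \mathbb{Z}_{10}^{14}}\big[x_{12}^i + x_{13}^i + x_0^i \equiv 1 \bmod 2 \,\big|\, f_1(\sigma(x_0,\ldots,x_{13})) \equiv 3i \bmod 10\big] \\
&\ne \Pr_{(x_0,\ldots,x_{13}) \sim \mathbb{Z}_{10}^{14}}\big[x_{12}^i + x_{13}^i + x_0^i \equiv 1 \bmod 2 \,\big|\, f_1(\sigma(x_0,\ldots,x_{13})) \not\equiv 3i \bmod 10\big].
\end{aligned}$$

Similarly, by Fact 8 it follows that

$$\begin{aligned}
&\Pr_{(x_0,\ldots,x_{13}) \sim \mathbb{Z}_{10}^{14}}\big[x_{11}^i + x_{12}^i + x_{13}^i + x_0^i \equiv 1 \bmod 2 \,\big|\, f_2(\sigma(x_0,\ldots,x_{13})) \equiv 4i \bmod 10\big] \\
&\ne \Pr_{(x_0,\ldots,x_{13}) \sim \mathbb{Z}_{10}^{14}}\big[x_{11}^i + x_{12}^i + x_{13}^i + x_0^i \equiv 1 \bmod 2 \,\big|\, f_2(\sigma(x_0,\ldots,x_{13})) \not\equiv 4i \bmod 10\big].
\end{aligned}$$

□

9.1 Optimizing Password Composition Policies: Missing Proofs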


Reminder of Theorem 16. For every $k$, Algorithm 5.4 computes $\arg\min_{\mathcal{A}} p(k, \mathcal{A})$ in the singleton rules setting of the normalized probabilities model, in time $O(N \log(N))$.


Proof of Theorem 16. Let $\mathcal{A}^*$ denote the optimal solution, denote its $k$ most popular passwords as $w_{i_1}, \ldots, w_{i_k}$, and denote also $P^*$ as the total probability mass of the words in $\mathcal{A}^*$ according to the initial distribution: $P^* = \sum_{w \in \mathcal{A}^*} \Pr[w]$. Therefore,

$$p(k, \mathcal{A}^*) = \sum_{j=1}^{k} \Pr[w_{i_j}] / P^*.$$

Clearly, all words $w_j$ s.t. $j > i_k$ belong to $\mathcal{A}^*$; otherwise, we could add such a word and decrease the probability of the top $k$ words. Similarly, all words $w_j$ s.t. $j < i_1$ must not belong to $\mathcal{A}^*$, otherwise they would belong to the set of the $k$ most popular words. We now claim that $w_{i_1}, \ldots, w_{i_k}$ are $k$ consecutive words.

Suppose that there was some word $w'$ between some $w_{i_j}$ and $w_{i_{j+1}}$. Then $\mathcal{A}^*$ clearly banned it, otherwise it would be one of the $k$ most popular words. We claim that the policy $\mathcal{A}'$ where we ban $w_{i_1}$ and allow $w'$ instead satisfies $p(k, \mathcal{A}') \le p(k, \mathcal{A}^*)$.

We denote $p_1 = \Pr[w_{i_1}]$, $q = \sum_{j=2}^{k} \Pr[w_{i_j}]$ and $p' = \Pr[w']$, and we know $p_1 \ge p'$. Then $p(k, \mathcal{A}^*) = (p_1 + q)/P^*$, whereas

$$p(k, \mathcal{A}') = \frac{p' + q}{P^* - p_1 + p'}.$$

Our goal is to show $p(k, \mathcal{A}') \le p(k, \mathcal{A}^*)$, which holds iff

$$(p' + q) P^* \le (p_1 + q)\big(P^* - (p_1 - p')\big).$$

By some algebraic manipulations, this holds iff

$$(p_1 - p') P^* \ge (p_1 - p')(p_1 + q),$$

which clearly holds because $p_1 - p'$ is a non-negative quantity, and $p_1 + q = \sum_{j=1}^{k} \Pr[w_{i_j}] \le \sum_{w \in \mathcal{A}^*} \Pr[w]$.

As for the running time of the algorithm, it is obvious that sorting requires $O(N \log N)$ time. Finding the minimum requires only $O(N)$ time: if we denote $a_i = \sum_{i \le j < i+k} \Pr[w_j]$ and $b_i = \sum_{i \le j} \Pr[w_j]$, then based on $a_i$ and $b_i$ it is easy to compute $a_{i+1}$ and $b_{i+1}$ in $O(1)$ time. □
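The structure established by the proof (the $k$ most popular allowed words are consecutive in sorted order, and every less popular word is allowed) suggests a sliding-window search. The sketch below is one way to realize it and is not a transcription of Algorithm 5.4, which is not reproduced in this section; the function name and the example distribution are hypothetical, and it assumes $1 \le k \le N$.

```python
from fractions import Fraction

def best_singleton_policy(probs, k):
    """Return (minimum of a_i / b_i, start index i) over windows of k consecutive words.

    a_i is the mass of words i..i+k-1 and b_i the mass of words i..N-1 in
    descending order of probability, matching the quantities in the proof.
    """
    ranked = sorted(probs, reverse=True)         # O(N log N)
    n = len(ranked)
    a = sum(ranked[:k])                           # a_0
    b = sum(ranked)                               # b_0
    best = (a / b, 0)
    for i in range(1, n - k + 1):
        a += ranked[i + k - 1] - ranked[i - 1]    # O(1) update from a_{i-1} to a_i
        b -= ranked[i - 1]                        # O(1) update from b_{i-1} to b_i
        best = min(best, (a / b, i))
    return best

probs = [Fraction(5, 10), Fraction(2, 10), Fraction(1, 10), Fraction(1, 10), Fraction(1, 10)]
ratio, start = best_singleton_policy(probs, k=1)
```

In this example the best policy for $k = 1$ bans the two most popular words (`start == 2`), leaving three equally likely passwords. Exact fractions are used only to keep the comparison free of rounding error.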
Reminder of Lemma 4. Fix $m$ and $s$ such that $m > s$. There exists a domain $D$ of size $O(s^2 \log(m))$ and a family of $m$ sets, $F_1, F_2, \ldots, F_m \subset D$, such that each set in the family contains $\frac{|D|}{2s}$ elements, and for every $C \subset [m]$ of size $|C| \le s$, we have that the size of the union $\left|\bigcup_{i \in C} F_i\right| \ge \frac{|C| |D|}{8s}$. This domain can be constructed in randomized poly$(s, m)$ time.

Proof of Lemma 4. Given $m$ and $s$, we first pick a random function $\phi : [m] \to [2s]$. Fixing a subset $C \subset [m]$ of size $|C| \le s$, we claim that $|\phi(C)| \ge |C|/2$ w.p. at least $1 - (0.825)^{|C|}$. Indeed,

$$\Pr[|\phi(C)| < |C|/2] \le \Pr[\exists F \subset [2s] \text{ s.t. } |F| = |C|/2 \text{ and } \forall i \in C, \phi(i) \in F].$$

So assuming $|C| \ge 8$ we have that $C$ is mapped to at least $|C|/2$ distinct images by $\phi$ w.p. $\ge 3/4$. Also, if $|C| \le 7$ then the probability of even two elements getting mapped to the same image is at most $\binom{7}{2} \frac{1}{2s} \le 0.25$ for $s \ge 42$.

We now construct $D$ by taking $d$ independently chosen such $\phi$-mappings, which we denote as $\phi_1, \phi_2, \ldots, \phi_d$, and so $D = [2s] \times [d]$. We construct the family $F_i = \{(\phi_1(i), 1), (\phi_2(i), 2), \ldots, (\phi_d(i), d)\}$ for every $i \in [m]$. Clearly, for every $i$ it holds that $|F_i| = d = |D|/2s$. Suppose for the sake of contradiction that there exists some $C \subset [m]$ of size $\le s$ such that $\left|\bigcup_{i \in C} F_i\right| < \frac{|C|}{4} |F_i|$. By construction, we have that

$$\left|\bigcup_{i \in C} F_i\right| = \sum_{j=1}^{d} \left|\{(\phi_j(i), j) : i \in C\}\right| = \sum_{j=1}^{d} |\phi_j(C)|$$

so by the Markov inequality there are at least $d/2$ functions for which the cardinality of the image of $C$ is less than $|C|/2$. Let $X_{C,j}$ be the indicator random variable of $\phi_j$ mapping the set $C$ to no more than $|C|/2$ distinct elements; the Hoeffding bound gives that

$$\Pr\Big[\exists C \text{ of size} \le s \text{ s.t. } \sum_j X_{C,j} > d/2\Big] \le \sum_{s' \le s} \binom{m}{s'} \Pr\Big[\frac{1}{d} \sum_j X_{C,j} > 0.5\Big] \le m^{O(s)} e^{-d/10}.$$

Setting $d = O(s \log m)$ gives that w.p. $\ge 1/2$ no such $C$ exists. □
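The randomized construction in the proof can be sketched directly. The parameter values, the seed and the function names below are hypothetical, and the sketch does not verify the union property (which holds only with constant probability for suitable $d$); it just builds the family $F_i = \{(\phi_j(i), j)\}$ over the domain $[2s] \times [d]$, using 0-based indices.

```python
import random

def build_family(m, s, d, seed=0):
    """Build F_0, ..., F_{m-1} from d independent random maps phi_j : [m] -> [2s]."""
    rng = random.Random(seed)
    # phi[j][i] is the image of element i under the j-th random map.
    phi = [[rng.randrange(2 * s) for _ in range(m)] for _ in range(d)]
    return [frozenset((phi[j][i], j) for j in range(d)) for i in range(m)]

def union_size(family, subset):
    """Size of the union of the sets F_i for i in subset."""
    return len(frozenset().union(*(family[i] for i in subset)))

family = build_family(m=50, s=5, d=40)
```

Every set has exactly $d$ elements because the second coordinate $j$ is distinct for each map, which is the identity $|F_i| = d = |D|/2s$ used in the proof.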

Reminder of Theorem 18. Unless P = NP there is no polynomial time algorithm (in $N, m, n$) which outputs $\arg\min_{S \subseteq [m]} p(k, \mathcal{A}_S)$ in the positive rules setting and the normalization model.
Proof of Theorem 18. Our reduction is from set cover.

Set Cover Instance: Sets $S_1, \ldots, S_m$, Universe $U = \{1, \ldots, n\}$ and integer $k$.
Question: Is there a set cover of size $k - 1$?

Now we define $W_1, \ldots, W_n$ to be $n$ disjoint sets of passwords

$$W_i = \{w_{i,\ell} \mid 1 \le \ell \le n^5 m^5\}.$$

We also define special passwords $t_j$ ($j \le m$) and $\tau_j$ ($j \le k$) which are not contained in any $W_i$.

We define the following positive password rules:

$$R_i = \{t_i\} \cup \{\tau_j \mid 1 \le j \le k\} \cup \bigcup_{j : j \in S_i} W_j.$$

We assign probabilities as follows:

$$\Pr[w_{i,\ell}] = \left(1 - \frac{1}{n^3}\right) \frac{1}{m^5 n^6}$$

for each $i \le n$ and $\ell \le m^5 n^5$. Observe that

$$\Pr\Big[\bigcup_{i \le n} W_i\Big] = 1 - \frac{1}{n^3},$$

so that almost all of the probability mass is concentrated inside the sets $W_i$ and the probability mass is uniformly distributed. We also set

$$\Pr[\tau_j] = \frac{1 - x}{n^3 k}$$

and

$$\Pr[t_j] = \frac{x}{n^3 m},$$

where $0 < x < 1$ will be defined later. First notice that

$$\sum_{j \le k} \Pr[\tau_j] + \sum_{j \le m} \Pr[t_j] = k \left(\frac{1 - x}{n^3 k}\right) + m \left(\frac{x}{n^3 m}\right) = \frac{1}{n^3},$$

so our probability distribution is well defined. Suppose that there is a set cover $C \subset [m]$ s.t. $|C| \le k - 1 \wedge \bigcup_{i \in C} S_i = U$, and consider the solution $\mathcal{A}_C$. We cover all $W_i$'s and use at most $k - 1$ $t$'s. Hence,

V  (K&c)  <  ((*  -  1)  Pr[f]  +  Pr[x])  x  j  . 

Suppose  that  there  is  no  set  cover  of  size  k.  For  every  set  of  k  or  more  rules  S  we 
have  at  least  k  t's  in  our  solution  so 


$$p(k, \mathcal{A}_S) \ge k \Pr[t].$$


For  every  set  of  rules  S  that  does  not  cover  all  the  W,'s  we  have  at  most  (l  —  ^^1  — 
fraction  of  the  total  probability  mass  so 


P  (k,  Jls)  > 


(( k  -  l)Pr[x]  +  Pr[f]) 


It  suffices  to  select  x  s.t. 


((*-l)Pr[f]  +  Pr[T])| 


IT 


<  mm 


((k  -  1)  Pr[x]  +  Pr[f]) 


or  — after  some  algebraic  manipulation  —  equivalently. 


,kPr[t]}  , 


l1  “  ) 


Pr[x]  <  Pr[f]  <  b  =  Pr[x] 


(k-2)  +  -^ 


n- 1 


(k-  2)-^t 


Observe  that  a  <  Pr[f]  <  b  so  it  suffices  to  set  x  s.t.  Pr[f]  =  q^L.  We  can  solve  for  x 
to  get 


m  (-3  +  2 n  +  2 n3  -  2n4  +  k  (n  -  l)2  (1  +  n  +  rz2)) 

m  (-3  +  2 n  +  2 n3  -  2 rz4)  +  k2  (2-2 n  —  n3  +  n4)  +  k  (-2  +  4 n  +  n3  -  2n4  +  m(n  -  l)2  (1  +  n  +  n2fj 

□ 


Reminder of Claim 5. $\Pr[\exists i, \mathrm{BAD}_i] \le \delta$.

Proof of Claim 5. By the union bound it suffices to show that

$$\Pr[\mathrm{BAD}_i] \le \frac{\delta}{m}.$$

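Our first step is to divide the passwords $w \in P$ into buckets $B_j$ based on their probability. For $j > 0$ we define

$$B_j = \left\{w \;\middle|\; 2^{-j} \epsilon \le \Pr[w \mid \mathcal{A}_{S_i}] < 2^{-j+1} \epsilon\right\},$$

and for $j = 0$ we set

$$B_0 = \left\{w \;\middle|\; \epsilon \le \Pr[w \mid \mathcal{A}_{S_i}]\right\}.$$

Observe that

$$P = \bigcup_{j=0}^{\infty} B_j.$$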
Let $w \in B_j$ be given ($j > 0$); then by the Chernoff bounds:


4 m\\  4  2  be 


Pr  [sw  >  s  Pr  [w  |  ftSi]  +  se/2]  <  exp  \^-2]  log  jj  <  - 
Notice that the bucket $B_j$ contains at most $|B_j| \le 2^j/\epsilon$ passwords.


r  i  4~'  6e 

Pr  G  Bj,sw  >  s  Pr  [w  \  Pis]  +  se/2j  < - — 

Now  if  we  union  bound  across  all  j  >  0  we  get 

oo  oo 

Pr  3iv  G  Bj,  sw  >  s  Pr  [zv  \  Pls]  +  se/2  <  ^ 


i\  5 
_  2  i+1m 


b  _  _b_ 
2  i+1m  2m 


Finally, we consider the passwords in $B_0$. By Chernoff bounds, for each $w \in B_0$ we
have

Pr  [| sw  -  s  Pr  [zv  \  Pls]\  >  se/2]  <  ^  , 
by  applying  the  union  bound  |B0|  <  1/e  we  get 

c 

Pr  g  B0  |sw  -  s  Pr  [zv  \  bfts]\  >  se/2]  <  —  . 

Combining  our  inequalities  we  obtain  the  desired  result: 


Pr  [BAD,]  <  Pr  3zv  G  Bj,  sw  >  s  Pr  [zv  \  Pis]  +  se/2 
9.2 Impossibility of constant-factor universal approximation

In this section we consider the following goal: given a constant $c$ find a password composition policy $\mathcal{A}$ such that

$$p(k, \mathcal{A}) \le c \cdot p(k, \mathcal{A}'),$$

for any other policy $\mathcal{A}'$ and every value of $k \le N$. Such a policy, if it exists, would provide a nearly optimal defense against both online attacks and dictionary attacks simultaneously [136]. Unfortunately, Theorem 32 rules out the possibility of a constant universal approximation in the rankings model. Our impossibility result holds even in the singleton rules setting. We show that it is possible to construct a distribution $D$ over rankings for which no universal approximation exists.

We construct our distribution $D$ (algorithm 9.1) over rankings by merging two distributions $D_1$ and $D_2$ over preference lists.

Intuition: Passwords sampled from $D_2$ are highly secure, but passwords sampled from $D_1$ are highly insecure. To improve the security of $D_1$ it is necessary to ban all passwords in $W$, but this reduces the security of $D_2$ significantly.

We make two claims: (1) We must ban all but a small subset of passwords if we want to even approximately optimize $p(1, \mathcal{A})$. (2) We must keep a larger subset of passwords to even approximately optimize $p(k, \mathcal{A})$ for large values of $k$.

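Theorem 32. For all constants $c > 0$ there exists a distribution $D$ over rankings such that $\forall \mathcal{A} \subseteq P$, $\exists \mathcal{A}' \subseteq P$, $k \in \mathbb{N}$, such that

$$p(k, \mathcal{A}) > c \cdot p(k, \mathcal{A}').$$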

Proof,  (sketch)  Let  P  =  W  U  X  where  W  =  IJ'=1  W,  —  W,-  =  {iVif i,  . . .  ,vo if  —  and 
X  =  {x\,  . . . ,  xf  are  two  disjoint  sets  of  passwords,  where  the  parameters  are  set  as 
follows  cj  =  fcr  t  =  L  =  log N  and  r  =  X-k  Our  distribution  over  preference  lists  is 
given  by  algorithm  9.1. 

There  are  two  cases  to  consider: 

Case  1:  3x  e  W  -  FFK  then  it  is  easy  to  see  that 

p(  1,^)  -  7  =  y  =  Y  ~  2cXP(1'X)  • 
Algorithm 9.1 Sample D

Input:
    Parameters L, r, q, t
    Random Number u ∈ [0, 1]
    Random Permutation π_i of W_i for each i ∈ {1, ..., r}
    Random Permutation π_X over X
    Random Permutation π_P of P
Initialize: ℓ ← empty ranking

if u ≤ q then                    ▷ Select from D_1
    for i = 1 → r do
        ℓ ← (ℓ, π_i)             ▷ Append random permutation of W_i
    ℓ ← (ℓ, π_X)                 ▷ Append random permutation of X
else                             ▷ Select from D_2
    ℓ ← π_P
return ℓ
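Algorithm 9.1 can be sketched in Python as follows. This is an illustrative rendering rather than the authors' code: the function name is hypothetical, the random permutations are drawn inside the function instead of being passed in as inputs, and the parameters $L$, $r$ and $t$ are implicit in the sizes of the lists supplied by the caller.

```python
import random

def sample_ranking(w_blocks, x_set, q, rng):
    """Sample a ranking: with probability q from D_1, otherwise from D_2."""
    u = rng.random()
    if u <= q:                                           # select from D_1
        ranking = []
        for block in w_blocks:                           # i = 1, ..., r
            ranking += rng.sample(block, len(block))     # random permutation of W_i
        ranking += rng.sample(x_set, len(x_set))         # random permutation of X
        return ranking
    # select from D_2: a uniformly random permutation of all passwords P
    everything = [w for block in w_blocks for w in block] + list(x_set)
    return rng.sample(everything, len(everything))

blocks = [["a1", "a2"], ["b1", "b2"]]    # hypothetical W_1, W_2
extras = ["x1", "x2", "x3"]              # hypothetical X
from_d1 = sample_ranking(blocks, extras, q=1.0, rng=random.Random(1))
from_d2 = sample_ranking(blocks, extras, q=0.0, rng=random.Random(1))
```

A ranking drawn from $D_1$ always lists the blocks $W_1, \ldots, W_r$ in order before $X$, which is what makes the passwords in $W$ predictable to an attacker.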


Case 2: Suppose that $\forall x \in W$ we have $x \notin \mathcal{A}$ and consider $k = L$ with the solution $P$ (don't ban any passwords). For the solution $P$ we have

$$p_i = \frac{q}{t} + \frac{1 - q}{|X| + |W|},$$

for $i \le t$ (e.g., for the $t$ passwords in $W_1$), and

$$p_i = \frac{1 - q}{|X| + |W|},$$

for $i > t$.


P(k,p)  =  + 

=  c{q  +  0--q) 


t  |X|  +  |W| 


|X|  +  |W| 


V'  v  L  +  10r 

2  \  2  /  L  +  10'' 

<  1  =  p  (k,  £K)  . 


Chapter 10

Appendix: GOTCHA Password Hackers


10.1  Missing  Proofs 


Reminder of Claim 6. If $(G_1, G_2)$ is an $(\alpha, \beta, \epsilon, \delta, \mu)$-GOTCHA then at least a $\beta$-fraction of humans can successfully authenticate using protocol 6.3.2 after creating an account using protocol 6.3.1.

Proof of Claim 6. A legitimate user $H \in \mathcal{H}$ will use the same passwords in protocols 6.3.1 and 6.3.2. Hence,

$$r_1' = \mathrm{Extract}(pw', r') = \mathrm{Extract}(pw, r') = r_1,$$

and the final matching challenge $\hat{c}_\pi$ is the same one that would be generated by $G_2(1^k, r_1, H(G_1(1^k, r_1, r_2), \sigma_0))$. If $\hat{c}_\pi$ is consistently solvable with accuracy $\alpha$ by $H$ (by definition 15 this is the case for at least a $\beta$-fraction of users) then it follows that

dk(n,n',at)  <  a, 

where  H  (Gi  (lk,  r\,  r2)).  For  some  7i0  (namely  7Zo  =  n)  s.t.  dk  (no,  n')  <  a  it  must  be 
the  case  that 


$$h_{pw',0} = h(u, s, pw', \pi_0(1), \ldots, \pi_0(k)) = h(u, s, pw, \pi(1), \ldots, \pi(k)) = h_{pw},$$

and protocol 6.3.2 accepts.

□

Reminder of Claim 7. For all permutations $\pi : [k] \to [k]$ and $\alpha \ge 0$

$$\left|\{\pi' \mid d_k(\pi, \pi') \le \alpha\}\right| \le 1 + \sum_{i=2}^{\alpha} \binom{k}{i} i!.$$

Proof of Claim 7. It suffices to show that $\binom{k}{j} j! \ge \left|\{\pi' \mid d_k(\pi, \pi') = j\}\right|$. We first choose the $j$ unique indices $i_1, \ldots, i_j$ on which $\pi$ and $\pi'$ differ; there are $\binom{k}{j}$ ways to do this. Once we have fixed our indices $i_1, \ldots, i_j$ we define $\pi'(\ell) = \pi(\ell)$ for each $\ell \notin \{i_1, \ldots, i_j\}$. Now $j!$ upper bounds the number of ways of selecting the remaining values $\pi'(i_\ell)$ s.t. $\pi(i_\ell) \ne \pi'(i_\ell)$ for all $\ell \le j$. □
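For small $k$ the bound of Claim 7 can be compared against a brute-force count. The sketch assumes that $d_k$ counts the positions on which two permutations differ, as the proof suggests, and that the sum in the claim starts at $i = 2$; the function names are hypothetical.

```python
from itertools import permutations
from math import comb, factorial

def hamming(p, q):
    """Number of positions on which two permutations differ (assumed meaning of d_k)."""
    return sum(1 for a, b in zip(p, q) if a != b)

def count_within(pi, alpha):
    """Brute-force count of permutations within distance alpha of pi."""
    return sum(1 for other in permutations(pi) if hamming(pi, other) <= alpha)

def claim7_bound(k, alpha):
    """Right-hand side of Claim 7: 1 + sum_{i=2}^{alpha} C(k, i) * i!."""
    return 1 + sum(comb(k, i) * factorial(i) for i in range(2, alpha + 1))

pi = (0, 1, 2, 3, 4)
counts = {alpha: count_within(pi, alpha) for alpha in range(6)}
```

For $k = 5$ the counts stay well below the bound, which is loose because $j!$ overcounts the ways to rearrange $j$ positions so that every one of them changes.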




10.2  HOSP:  Pre-Generated  CAPTCHAs 


The HOSP construction proposed by [51] was to simply fill several high capacity hard drives with randomly generated CAPTCHAs, discarding the solutions. Once we have compiled a large database D of CAPTCHAs we can use algorithm 10.1 as our challenge generator: simply return a random CAPTCHA from D. The advantage of this approach is that we can make use of already tested CAPTCHA solutions so there is no need to make hardness assumptions about new AI problems. The primary disadvantage of this approach is that the size of the database D will be limited by economic considerations, since storage isn't free. While |D|, the number of CAPTCHAs that could be stored on a hard drive, may be large, it is not exponentially large. An adversary could theoretically pay humans to solve every puzzle in D, at which point the scheme would be completely broken.


Algorithm 10.1 GenerateChallenge

Input: Random bits r ∈ {0, 1}^n, Database D = {P_1, ..., P_{2^n}} of CAPTCHAs

return P_r


Economic Cost Suppose that two 4 TB hard drives are filled with text CAPTCHAs¹. Let $S$ be the space required to store one CAPTCHA, and let $C_H$ denote the cost of paying a human to solve a CAPTCHA. We use the values $S = 8$ KB² and $C_H = \$0.001$³. In this case $|D| = \frac{8\ \text{TB}}{S} \approx 10^9$ so we can store a billion unsolved CAPTCHAs on the hard drives. It would cost the adversary $|D| \, C_H = \$1{,}000{,}000$ to solve all of the CAPTCHAs, or $\$500{,}000$ to solve half of them. The up-front cost of this attack may be large, but once the adversary has solved the CAPTCHAs he can execute offline dictionary attacks against every user who had an account on the server. Many server breaches have resulted in the release of password records for millions of accounts [5, 9, 11, 13]. If each cracked password is worth between $4 and $30 [79] then it may easily be worth the cost to pay humans to solve every CAPTCHA in D.
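The arithmetic behind these figures is easy to reproduce. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 KB = 10^3 bytes), which the text does not specify, and keeps the per-CAPTCHA cost in thousandths of a dollar so that all quantities stay integers; the variable names are illustrative.

```python
TB = 10**12   # decimal terabyte (an assumption; binary units would change the count slightly)
KB = 10**3    # decimal kilobyte (same assumption)

drive_capacity_bytes = 2 * 4 * TB        # two 4 TB hard drives
captcha_size_bytes = 8 * KB              # S = 8 KB per stored CAPTCHA
cost_per_solution_millidollars = 1       # C_H = $0.001 per human solution

database_size = drive_capacity_bytes // captcha_size_bytes             # |D|
full_cost_dollars = database_size * cost_per_solution_millidollars // 1000
half_cost_dollars = full_cost_dollars // 2
```

With these values the database holds a billion CAPTCHAs and the adversary's cost is one million dollars for all of them, matching the estimate in the text.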

1 At the time of submission a 4 TB hard drive can be purchased on Amazon for less than $162.

2 The exact value of S may vary slightly depending on the particular method used to generate the CAPTCHA. When we compressed a text CAPTCHA using the popular GIF format the resulting files were consistently 8 KB.

3 Motoyama et al. estimated that spammers paid humans $1 to solve a thousand CAPTCHAs [110].